Advanced Filter
Filter & SortFilter rows using one or more conditions combined with AND/OR logic and parenthesised grouping, keeping only the rows that match your criteria. This is the primary tool for subsetting data before analysis -- for example, selecting only a specific treatment arm, age range, or experimental condition.
When to Use
- You need to keep only rows matching specific criteria before running a statistical test (e.g., only female participants over age 30, or only samples from a particular batch).
- You want to combine multiple conditions with AND logic: for example, Age > 30 AND Group = "Treatment" AND Site = "London".
- You need complex logic with grouped conditions using OR: for example, (Region = "North" AND Sales > 100) OR (Region = "South" AND Sales > 200).
- You want to exclude outliers or invalid entries before analysis, such as removing rows where a quality score is below a threshold or where a required field is empty.
Required Inputs
- Column -- the column to filter on. You can add multiple filter rows to apply conditions on different columns simultaneously.
- Operator -- the comparison type. Available operators include: equals, not equals, greater than, greater than or equal, less than, less than or equal, contains, starts with, ends with, is empty, is not empty, and matches regex.
- Value -- the comparison value to test each row against. Not required for the "is empty" and "is not empty" operators. For numeric comparisons, enter a number; for text comparisons, enter a string.
What Changes
- Row count decreases because rows that do not match the filter conditions are removed from the dataset.
- Column structure is preserved -- no columns are added or removed. The result has the same columns as the input.
- The operation is applied in place, replacing the current data. Undo is available to revert to the previous state if the filter was too aggressive or applied incorrectly.
- Row indices are reset after filtering. If you had row numbers as a column, those values are preserved, but the grid row positions change.
Common Pitfalls
- Applying numeric comparisons (greater than, less than) to a text column compares values lexicographically, which produces incorrect results for numbers stored as strings (e.g., "9" > "80" because "9" > "8"). Convert the column to numeric type first.
- String matching is case-insensitive by default. Toggle the "Aa" button to enable case-sensitive matching when the distinction matters (e.g., filtering gene names like "TP53" vs "tp53").
- Empty and missing values (NA, N/A, blank cells, NaN) are all treated as empty by the "is empty" operator. Use "is not empty" to exclude them, or be aware that numeric filters silently skip empty cells.
- When using OR logic between groups, make sure the parenthesised grouping is correct. (A AND B) OR C is different from A AND (B OR C).
Example: Before & After
Before
| Name | Age | Group |
|---|---|---|
| Alice | 28 | Control |
| Bob | 35 | Treatment |
| Carol | 22 | Control |
| Dave | 41 | Treatment |
→
After
| Name | Age | Group |
|---|---|---|
| Bob | 35 | Treatment |
| Dave | 41 | Treatment |
Filter: Group = "Treatment". Only the two Treatment rows are retained; Control rows are removed.