Pivot Wider
ReshapeSpread a column's unique values into new column headers, converting long-format (stacked) data into wide-format (one row per subject). This is one of the most common reshape operations in biostatistics, often needed before running paired t-tests, repeated-measures ANOVA, or any analysis that expects each measurement in its own column.
When to Use
- Repeated measurements are stacked in rows (e.g., Pre and Post scores in the same column) and you need each measurement as its own column for a paired comparison.
- Each subject has multiple rows for different conditions or time points and you need exactly one row per subject for wide-format statistical tests such as paired t-tests or repeated-measures ANOVA.
- A "condition", "time", or "visit" column should become column headers so the data layout matches what the analysis expects. For example, a "Dose" column with values 10mg, 20mg, 50mg should become three separate columns.
- You are exporting data to a collaborator or software tool that requires wide-format input (e.g., certain GraphPad Prism layouts).
Required Inputs
- Names From -- the column whose unique values will become the new column headers. Each distinct value in this column creates one new column.
- Values From -- the column (or columns) whose cell values will fill the newly created columns. If multiple value columns are selected, column names are constructed by combining the name and value labels.
What Changes
- Row count decreases because rows sharing the same ID combination are collapsed into a single row. For example, if each subject has two rows (Pre and Post), the result will have one row per subject.
- Column count increases because each unique value in the "Names From" column becomes a new column in the output.
- The original "Names From" and "Values From" columns are consumed and replaced by the new wider structure. They no longer appear as standalone columns.
- All other columns (ID columns) are preserved and serve as the key that identifies each row in the wider output.
Common Pitfalls
- If you do not include proper ID columns, rows may be matched by position alone, producing unexpected pairings. Always ensure there is at least one column that uniquely identifies each entity (e.g., Subject, PatientID).
- Duplicate combinations of ID + Names From value require an aggregation function (mean, sum, first, last, count). Without one, the pivot is ambiguous and easyCris will warn you. Choose the appropriate aggregation for your data.
- Non-uniform group sizes (e.g., three Pre values but only two Post values for a subject) produce empty cells in the wider output. Review and handle these missing values before analysis.
- Column names derived from data values may contain spaces or special characters. easyCris preserves them as-is, but downstream tools may require renaming.
Example: Before & After
Before
| Subject | Time | Score |
|---|---|---|
| S1 | Pre | 80 |
| S1 | Post | 92 |
| S2 | Pre | 75 |
| S2 | Post | 88 |
→
After
| Subject | Pre | Post |
|---|---|---|
| S1 | 80 | 92 |
| S2 | 75 | 88 |
Names From: Time, Values From: Score. Each subject's two rows collapse into one row with separate Pre and Post columns.