Efficient data analysis in R requires knowing how to filter rows, select specific columns, and arrange data in a meaningful order. The dplyr package makes these operations intuitive and fast.

1. Filtering Data with filter()

The filter() function is used to extract rows that meet specific conditions.

library(dplyr)

data <- data.frame(
  Name = c("Alice", "Bob", "Charlie", "David"),
  Age = c(25, 30, 28, 22),
  Score = c(90, 85, 88, 95)
)

# Filter rows where Age is greater than 25
filter(data, Age > 25)

# Filter rows where Score is at least 90
filter(data, Score >= 90)

You can also combine multiple conditions using & (AND) and | (OR):

# Age > 25 AND Score >= 88
filter(data, Age > 25 & Score >= 88)
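
The AND example above has a few close relatives. The sketch below shows an OR condition, comma-separated conditions (which filter() combines with AND), and matching against a set of values with %in%. The variable names either, both, and named are illustrative only, and the data frame is redefined so the block runs on its own.

```r
library(dplyr)

# Same example data as above, redefined so this block is self-contained
data <- data.frame(
  Name = c("Alice", "Bob", "Charlie", "David"),
  Age = c(25, 30, 28, 22),
  Score = c(90, 85, 88, 95)
)

# OR: keep rows where Age is under 25 or Score is at least 90
either <- filter(data, Age < 25 | Score >= 90)   # Alice and David

# Comma-separated conditions are combined with AND,
# so this is equivalent to filter(data, Age > 25 & Score >= 88)
both <- filter(data, Age > 25, Score >= 88)      # Charlie only

# %in% keeps rows whose value appears in a set
named <- filter(data, Name %in% c("Alice", "David"))
```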

2. Selecting Columns with select()

The select() function is used to pick specific columns from a dataset.

# Select Name and Score columns
select(data, Name, Score)

# Exclude a column
select(data, -Age)
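
Two other common forms of select() are a column range and renaming while selecting. This is a minimal sketch on the same example data. Student is a made-up column name used only for illustration.

```r
library(dplyr)

# Same example data as above, redefined so this block is self-contained
data <- data.frame(
  Name = c("Alice", "Bob", "Charlie", "David"),
  Age = c(25, 30, 28, 22),
  Score = c(90, 85, 88, 95)
)

# Select a contiguous range of columns, from Name through Age
range_cols <- select(data, Name:Age)

# Rename a column while selecting it (new_name = old_name)
renamed <- select(data, Student = Name, Score)
```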

You can also select columns using helpers like starts_with(), ends_with(), or contains():

# Select columns that start with "S"
select(data, starts_with("S"))
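
The other helpers work the same way. The sketch below applies ends_with(), contains(), and everything() to the example data. By default these helpers match names case-insensitively. The result variable names are illustrative.

```r
library(dplyr)

# Same example data as above, redefined so this block is self-contained
data <- data.frame(
  Name = c("Alice", "Bob", "Charlie", "David"),
  Age = c(25, 30, 28, 22),
  Score = c(90, 85, 88, 95)
)

# Columns whose names end with "ge" (only Age)
ends <- select(data, ends_with("ge"))

# Columns whose names contain "am" (only Name)
has_am <- select(data, contains("am"))

# Move Score to the front and keep every remaining column after it
reordered <- select(data, Score, everything())
```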

3. Arranging Data with arrange()

The arrange() function is used to reorder rows based on one or more columns.

# Arrange by Age ascending
arrange(data, Age)

# Arrange by Score descending
arrange(data, desc(Score))

# Arrange by multiple columns
arrange(data, Age, desc(Score))
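
In the example data every Age is unique, so the second sort column never comes into play. The sketch below uses a small hypothetical data frame, tied, in which two rows share an Age, to show how desc(Score) breaks the tie.

```r
library(dplyr)

# Hypothetical data: Alice and Erin share Age 25, so the tie-break is visible
tied <- data.frame(
  Name = c("Alice", "Bob", "Erin"),
  Age = c(25, 30, 25),
  Score = c(90, 85, 97)
)

# Sort by Age ascending; within equal ages, sort by Score descending
sorted <- arrange(tied, Age, desc(Score))

# Resulting order of Name: Erin (25, 97), Alice (25, 90), Bob (30, 85)
sorted$Name
```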

4. Combining Filtering, Selecting, and Arranging

Using the pipe operator %>%, which dplyr re-exports from the magrittr package, you can chain operations together:

data %>%
  filter(Age > 25) %>%
  select(Name, Score) %>%
  arrange(desc(Score))

This filters rows where Age is greater than 25, selects only Name and Score columns, and sorts the results by Score in descending order.
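
To see what the pipe is doing, it can help to compare the chain with the equivalent nested calls. The sketch below stores both results (piped and nested are illustrative names) and checks that they match.

```r
library(dplyr)

# Same example data as above, redefined so this block is self-contained
data <- data.frame(
  Name = c("Alice", "Bob", "Charlie", "David"),
  Age = c(25, 30, 28, 22),
  Score = c(90, 85, 88, 95)
)

# Piped form: each step receives the previous result as its first argument
piped <- data %>%
  filter(Age > 25) %>%
  select(Name, Score) %>%
  arrange(desc(Score))

# Nested form: the same operations, read from the inside out
nested <- arrange(select(filter(data, Age > 25), Name, Score), desc(Score))

# Both give two rows: Charlie (88) first, then Bob (85)
identical(piped, nested)
```

On R 4.1 or later, the native pipe |> can be used in the same way for a chain like this one.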

5. Advantages of Using These Functions

Simplifies data exploration and cleaning
Makes code readable and maintainable
Efficiently handles large datasets
Integrates seamlessly with other dplyr and tidyverse functions

Conclusion

Filtering, selecting, and arranging data are core steps in data analysis. Mastering filter(), select(), and arrange() in R allows you to extract meaningful subsets of data, focus on relevant variables, and present your data in an organized and insightful way. Using the pipe operator %>% further streamlines these operations, making your workflow clean and efficient.
