How to remove duplicates based on the combination of two columns in R
# Sample data frame
data <- data.frame(
  col1 = c("A", "B", "C", "A", "B", "C"),
  col2 = c(1, 2, 3, 1, 2, 3)
)
# Remove duplicates based on combinations of col1 and col2
unique_data <- data[!duplicated(data[c("col1", "col2")]), ]
Explanation:
1. Create a sample data frame, data, with columns col1 and col2.
2. Pass the col1 and col2 columns of data to duplicated(), which flags every row whose combination of values has already appeared in an earlier row.
3. Negate the result with the ! operator to get a logical vector that is TRUE for rows that are not duplicates of an earlier col1/col2 combination. The first occurrence of each combination is therefore kept.
4. Subset data with this logical vector to keep only the non-duplicate rows, and store the result in unique_data.
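The steps above can be traced one at a time in a short sketch. The extra value column and the names is_dup and unique_last are assumptions added for illustration; they are not part of the original example. The extra column shows that only col1 and col2 decide which rows count as duplicates, while the other columns are carried along unchanged.

```r
# Hypothetical data: the same key columns as above plus an illustrative 'value' column
data <- data.frame(
  col1  = c("A", "B", "C", "A", "B", "C"),
  col2  = c(1, 2, 3, 1, 2, 3),
  value = c(10, 20, 30, 40, 50, 60)
)

# Step 2: flag rows whose (col1, col2) pair already appeared in an earlier row
is_dup <- duplicated(data[c("col1", "col2")])
print(is_dup)
# values: FALSE FALSE FALSE TRUE TRUE TRUE

# Steps 3-4: negate the flags and keep only the first occurrence of each pair
unique_data <- data[!is_dup, ]
print(unique_data)

# Variation: keep the last occurrence of each pair instead of the first
unique_last <- data[!duplicated(data[c("col1", "col2")], fromLast = TRUE), ]
print(unique_last)
```

Here unique_data keeps rows 1 to 3 (values 10, 20, 30) and unique_last keeps rows 4 to 6 (values 40, 50, 60). If the data frame contains only the key columns, as in the original sample, unique(data) should give the same rows as the first approach.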