ggplot2 - How can I color a line graph by grouping the variables in R?
I have produced a line graph that looks like this:

I have a data set of 50 countries with their GDP for the last 10 years.
Sample data:

    country variable value
    china   y2007    3.55218e+12
    usa     y2007    1.45000e+13
    japan   y2007    4.51526e+12
    uk      y2007    3.06301e+12
    russia  y2007    1.29971e+12
    canada  y2007    1.46498e+12
    germany y2007    3.43995e+12
    india   y2007    1.20107e+12
    france  y2007    2.66311e+12
    skorea  y2007    1.12268e+12

I generated the line graph using this code:

    gdp_lineplot = ggplot(data = gdp_linechart, aes(x = variable, y = value)) +
      geom_line() +
      scale_y_continuous(name = "gdp(usd in trillions)",
                         breaks = c(0.0e+00, 5.0e+12, 1.0e+13, 1.5e+13),
                         labels = c(0, 5, 10, 15)) +
      scale_x_discrete(name = "years",
                       labels = c(2007, "", 2009, "", 2011, "", 2013, "", 2015))

The idea is to make the graph look like this:
I tried adding

    group = country, color = country

but it colors all of the countries.
How can I color the countries as the top 4 and the rest?
PS: I am still naive in R.
When plotting subsets, the other groups aren't included in the colour legend on the right. The alternative approach below manipulates factor levels and uses a customized colour scale to overcome this.
Preparing data
It is assumed that gdp_long contains the data in long format. This is in line with the data shown by the OP (gdp_linechart; see the Data section below for differences). To manipulate the factor levels, the forcats package is used (and data.table).

    library(data.table)
    library(forcats)
    # coerce to data.table, reorder the factor levels by the value in the last (= most recent) year
    setDT(gdp_long)[, country := fct_reorder(country, -value, last)]
    # create a new factor which collapses all countries to "Other" except the top 4 countries
    gdp_long[, top_country := fct_other(country, keep = head(levels(country), 4))]

Create plot

    library(ggplot2)
    ggplot(gdp_long, aes(year, value/1e12, group = country, colour = top_country)) +
      geom_point() +
      geom_line(size = 1) +
      theme_bw() +
      ylab("gdp(usd in trillions)") +
      scale_colour_manual(name = "country",
                          values = c("green3", "orange", "blue", "red", "grey"))

The chart is quite similar to the expected result. The lines of the top 4 countries are displayed in different colours, while the other countries are displayed in grey and still appear in the colour legend on the right.
Note that the group aesthetic is still needed so that a single line is plotted for each country, while the colour is controlled by the levels of top_country.
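To see what the two factor steps from the data preparation do, here is a minimal sketch on made-up toy data. The data frame toy, its values and the helper last_value are only illustrative assumptions, not the OP's data. The helper stands in for last and relies on the rows being ordered by year.

```r
library(forcats)

# Toy data (hypothetical values): six countries, two years, rows ordered by year
toy <- data.frame(
  country = rep(c("a", "b", "c", "d", "e", "f"), times = 2),
  year    = rep(c(2014L, 2015L), each = 6),
  value   = c(5, 9, 1, 7, 3, 2,
              6, 8, 1, 9, 4, 2),
  stringsAsFactors = FALSE
)

# Illustrative helper: pick the last element, i.e. the most recent year here
last_value <- function(x) x[length(x)]

# Order the levels by descending value in the most recent year
toy$country <- fct_reorder(toy$country, -toy$value, .fun = last_value)

# Keep the first four levels, collapse all remaining ones into "Other"
toy$top_country <- fct_other(toy$country, keep = head(levels(toy$country), 4))

levels(toy$country)      # "d" "b" "a" "e" "f" "c"
levels(toy$top_country)  # "d" "b" "a" "e" "Other"
```

Since top_country has five levels (four countries plus "Other"), the manual colour scale needs exactly five values, with the last one (grey in the plot above) going to "Other".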
Data
The data set is too large to be reproduced here (even with dput()). The structure

    str(gdp_long)
    'data.frame': 1763 obs. of 3 variables:
     $ country: chr "afghanistan" "albania" "algeria" "andorra" ...
     $ year   : int 2007 2007 2007 2007 2007 2007 2007 2007 2007 2007 ...
     $ value  : num 9.84e+09 1.07e+10 1.35e+11 4.01e+09 6.04e+10 ...

is similar to the OP's data, with the exception that the variable column has been converted to an integer column year. This gives a nicely formatted x-axis without additional effort.
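If the data still has the OP's layout, one possible way to get such an integer year column is to strip the leading "y" from variable. This is only a sketch in base R. It assumes every entry looks like "y2007", and it uses three rows copied from the sample data in the question.

```r
# Three rows in the OP's layout (values taken from the sample data)
gdp_linechart <- data.frame(
  country  = c("china", "usa", "japan"),
  variable = c("y2007", "y2007", "y2007"),
  value    = c(3.55218e+12, 1.45000e+13, 4.51526e+12),
  stringsAsFactors = FALSE
)

# Remove the leading "y" and convert the remainder to integer
gdp_linechart$year <- as.integer(sub("^y", "", gdp_linechart$variable))

str(gdp_linechart$year)
```

With a numeric year on the x-axis, ggplot2 uses a continuous scale, so the manual scale_x_discrete() labels from the question are no longer needed.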
