This location contains the GDSC (Genomics of Drug Sensitivity in Cancer) data set, comprising 145 oncogene mutation statuses and ~1200 chemical descriptors. For instructions on how to run binary classification (predicting compound activity vs. inactivity) or regression (predicting log IC50 values), see README.classification.txt and README.regression.txt, respectively.