Orange
13/11/2017
Participants: 41, Submissions: 638

The dataset is about customers of the French Telecom company Orange.
The goal is to predict the propensity of customers to cancel their account (called churn). For privacy reasons, predictors are anonymized: you don’t know the meaning of any of the predictors.

This dataset is an opportunity to deal with a very large database, including heterogeneous noisy data (numerical and categorical predictors with missing values), and unbalanced class distribution for the response.
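A quick way to see how much of this applies is to measure the share of missing values per predictor and the class balance of the response. The following is only a minimal sketch: the data frame `toy` and its values are made up for illustration and stand in for the real `train` data.

```r
# Toy stand-in for the real training data (hypothetical values).
toy <- data.frame(
  Var1  = c(1.5, NA, 3.2, NA, 0.7, 2.1),
  Var2  = c("a", "b", NA, "a", "a", "b"),
  churn = c(-1, -1, 1, -1, -1, -1)
)

# Share of missing values in each predictor (response excluded).
predictors <- setdiff(names(toy), "churn")
na_share <- colMeans(is.na(toy[predictors]))

# Relative frequency of each class of the response.
class_share <- prop.table(table(toy$churn))

print(na_share)
print(class_share)
```

On the real data, the same two summaries would be obtained by replacing `toy` with `train`.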

AUC = Area Under the Curve
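Assuming AUC here means the area under the ROC curve (the usual reading), it can be computed from ranks without any extra package: it equals the probability that a randomly chosen churner receives a higher score than a randomly chosen non-churner. The helper `auc` and the toy vectors below are illustrative names and values, not part of the competition material.

```r
# AUC via the rank (Mann-Whitney) formula.
# score: predicted scores; label: response coded as -1 / +1.
auc <- function(score, label) {
  pos <- label == 1
  n_pos <- sum(pos)
  n_neg <- sum(!pos)
  r <- rank(score)  # ties receive average ranks
  (sum(r[pos]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Toy example (hypothetical values): 2 churners, 3 non-churners.
label <- c(-1, -1, 1, -1, 1)
score <- c(0.1, 0.4, 0.35, 0.2, 0.8)
toy_auc <- auc(score, label)
print(toy_auc)  # 5 of the 6 churner/non-churner pairs are ordered correctly
```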

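# Read the training and test sets (extract the CSV files from the zip archives first).
# stringsAsFactors = TRUE keeps categorical predictors as factors on R >= 4.0.
train <- read.csv("train.csv", stringsAsFactors = TRUE)
test <- read.csv("test.csv", stringsAsFactors = TRUE)
# Add a placeholder response so that test has the same columns as train
test$churn <- NA
n <- nrow(train)
m <- nrow(test)
# Stack both sets so that factor predictors share the same levels, then split again
combi <- rbind(train, test)
train <- combi[1:n, ]
test <- combi[(n + 1):(n + m), ]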

library(rpart)
# Regression tree on the indicator churn == 1; predictions estimate P(churn = +1)
fit <- rpart(churn == 1 ~ ., data = train)
phat <- predict(fit, newdata = test)

# Write one predicted score per line, without row or column names
write.table(phat, file = "mySubmission.txt", row.names = FALSE, col.names = FALSE)
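Before writing a submission, it can help to estimate the AUC locally on a hold-out part of the training set, since the test responses are not available. The sketch below is one possible way to do that, not the organizers' procedure: it uses a synthetic data frame `toy` (made-up predictors `x1`, `x2`) in place of `train`, a 70/30 split chosen arbitrarily, and assumes AUC means the area under the ROC curve.

```r
library(rpart)

set.seed(1)
# Synthetic stand-in for the training data (hypothetical).
n_toy <- 400
toy <- data.frame(x1 = rnorm(n_toy), x2 = rnorm(n_toy))
toy$churn <- ifelse(toy$x1 + rnorm(n_toy) > 1, 1, -1)

# Fit on 70% of the rows, predict on the remaining 30%.
idx <- sample(n_toy, size = round(0.7 * n_toy))
fit_part <- rpart(churn == 1 ~ ., data = toy[idx, ])
phat_val <- predict(fit_part, newdata = toy[-idx, ])

# Hold-out AUC via the rank (Mann-Whitney) formula.
y_val <- toy$churn[-idx] == 1
r <- rank(phat_val)
n_pos <- sum(y_val)
n_neg <- sum(!y_val)
auc_val <- (sum(r[y_val]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
print(auc_val)
```

With the real data, `toy` would be replaced by `train`; the estimate depends on the random split, so it is only a rough guide to the leaderboard score.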

Training set with n = 22253 observations
Test set with m = 27747 observations

RESPONSE VARIABLE
churn = -1 (no churn), +1 (churn)

PREDICTORS
Var1, Var2, …, Var230




train: train.csv.zip (4 MB)
test: test.csv.zip (5 MB)
#   Name                 Status  Score   Attempts  Last attempt
1   Giovanni Barbarani   FINAL   72.18%  32        18.06.2018 08:15
2   enricocartella       FINAL   72.15%  57        31.10.2017 08:11
3   a.valsecchi20        FINAL   72.15%  1         02.11.2017 21:02
4   beatrice.santoro06   FINAL   72.15%  1         02.11.2017 09:30
5   e.zucca6             FINAL   71.64%  30        07.11.2017 20:23
6   a.pascali            FINAL   71.64%  1         08.11.2017 21:49
7   davide.stenner       FINAL   71.37%  141       09.11.2017 09:23
8   f.devecchi5          FINAL   71.37%  10        06.11.2017 19:42
9   fumagalliroberta94   FINAL   71.37%  2         07.11.2017 17:53
10  f.cordaro2           FINAL   71.30%  26        08.11.2017 10:55
11  f.roberti            FINAL   71.30%  1         08.11.2017 20:06
12  g.maino2             FINAL   71.30%  1         09.11.2017 09:26
13  g.tornaghi1          FINAL   71.19%  45        07.11.2017 19:18
14  e.pasin              FINAL   71.19%  1         08.11.2017 17:09
15  sonia_cucchi         FINAL   71.19%  1         07.11.2017 10:55
16  m.mercandelli5       FINAL   71.13%  34        07.11.2017 14:32
17  g.ronco1             FINAL   71.13%  12        07.11.2017 10:39
18  m.ressico            FINAL   71.10%  30        08.11.2017 14:04
19  m.fornaroli          FINAL   71.10%  3         18.06.2018 10:51
20  t.comoglio           FINAL   71.10%  1         08.11.2017 14:16
21  g.asti               FINAL   71.08%  11        08.11.2017 23:36
22  s.offredi2           FINAL   71.08%  6         08.11.2017 15:14
23  m.cerliani           FINAL   71.08%  9         08.11.2017 15:58
24  l.granata1           FINAL   71.08%  6         08.11.2017 16:55
25  fasolini.a50         FINAL   71.07%  6         08.11.2017 20:20
26  e.gabanelli          FINAL   71.07%  3         08.11.2017 09:41
27  m.trabucchi1         FINAL   71.07%  3         06.11.2017 14:11
28  nicolo-p             FINAL   71.07%  3         05.11.2017 21:15
29  d.lacaj              FINAL   71.07%  1         06.11.2017 15:26
30  g.minniti2           FINAL   71.05%  20        15.06.2018 09:55
31  berruti.beatrice     FINAL   70.84%  13        23.01.2018 00:18
32  m.antoniazzi1        FINAL   70.82%  13        09.11.2017 10:10
33  g.caccia3            FINAL   70.82%  6         09.11.2017 10:19
34  j.soppelsa           FINAL   70.82%  6         19.01.2018 10:21
35  petrunistorica       FINAL   70.69%  24        22.01.2018 11:57
36  Lorenzo Palloni      FINAL   70.54%  7         17.06.2018 23:18
37  Michele De Vita      FINAL   69.85%  43        17.06.2018 18:03
38  s.cirelli1           FINAL   67.99%  5         17.06.2018 23:16
39  ruud.gullit          FINAL   65.79%  8         07.11.2017 13:31
40  solari.aldo          FINAL   50.24%  12        08.06.2018 14:57
41  santo.picci          FINAL   50.07%  3         10.06.2018 13:30