Overview

Dataset statistics

Number of variables: 32
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 14.8 MiB
Average record size in memory: 1.5 KiB

Variable types

CAT: 22
NUM: 7
BOOL: 2
DATE: 1

Reproduction

Analysis started: 2020-07-28 16:54:02.455401
Analysis finished: 2020-07-28 16:54:14.638727
Duration: 12.18 seconds
Version: pandas-profiling v2.8.0
Command line: pandas_profiling --config_file config.yaml [YOUR_FILE.csv]
Download configuration: config.yaml

Warnings

_FILE has constant value "dim_customer.csv" [Constant]
NAME_STYLE has constant value "False" [Constant]
FIRST_NAME has a high cardinality: 626 distinct values [High cardinality]
LAST_NAME has a high cardinality: 318 distinct values [High cardinality]
BIRTH_DATE has a high cardinality: 6196 distinct values [High cardinality]
ADDRESS_LINE_1 has a high cardinality: 8300 distinct values [High cardinality]
ADDRESS_LINE_2 has a high cardinality: 104 distinct values [High cardinality]
PHONE has a high cardinality: 4855 distinct values [High cardinality]
DATE_FIRST_PURCHASE has a high cardinality: 1097 distinct values [High cardinality]
BIRTH_DATE is uniformly distributed [Uniform]
ADDRESS_LINE_1 is uniformly distributed [Uniform]
_LINE has unique values [Unique]
CUSTOMER_KEY has unique values [Unique]
CUSTOMER_ALTERNATE_KEY has unique values [Unique]
EMAIL_ADDRESS has unique values [Unique]
TOTAL_CHILDREN has 2758 (27.6%) zeros [Zeros]
NUMBER_CHILDREN_AT_HOME has 6021 (60.2%) zeros [Zeros]
NUMBER_CARS_OWNED has 2354 (23.5%) zeros [Zeros]

Variables

_FILE
Categorical

CONSTANT
REJECTED

Distinct count: 1
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 78.2 KiB
Value             Count  Frequency (%)
dim_customer.csv  10000  100.0%

Length

Max length: 16
Median length: 16
Mean length: 16
Min length: 16

Overview of Unicode Properties

Unique unicode characters: 13
Unique unicode categories: 3
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
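
As a rough illustration of how such per-character tallies can be derived, the sketch below uses Python's standard `unicodedata` module to count general categories for the constant value `dim_customer.csv` repeated over the 10000 rows. The helper name `category_counts` is made up for this example, and the standard library does not expose the script or block properties, so only categories are shown. It is not necessarily how the report itself computes them.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories (e.g. 'Ll', 'Pc') over all characters."""
    counts = Counter()
    for value in values:
        counts.update(unicodedata.category(ch) for ch in value)
    return counts


# The _FILE column holds one constant string in every one of the 10000 rows.
counts = category_counts(["dim_customer.csv"] * 10000)
for category, count in counts.most_common():
    print(category, count)
```

Here `Ll`, `Pc` and `Po` are the short codes for Lowercase Letter, Connector Punctuation and Other Punctuation.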

Most occurring characters

Value  Count  Frequency (%)
m      20000  12.5%
c      20000  12.5%
s      20000  12.5%
d      10000  6.2%
i      10000  6.2%
_      10000  6.2%
u      10000  6.2%
t      10000  6.2%
o      10000  6.2%
e      10000  6.2%
r      10000  6.2%
.      10000  6.2%
v      10000  6.2%

Most occurring categories

Value                  Count   Frequency (%)
Lowercase Letter       140000  87.5%
Connector Punctuation  10000   6.2%
Other Punctuation      10000   6.2%

Most frequent Lowercase Letter characters

Value  Count  Frequency (%)
m      20000  14.3%
c      20000  14.3%
s      20000  14.3%
d      10000  7.1%
i      10000  7.1%
u      10000  7.1%
t      10000  7.1%
o      10000  7.1%
e      10000  7.1%
r      10000  7.1%
v      10000  7.1%

Most frequent Connector Punctuation characters

Value  Count  Frequency (%)
_      10000  100.0%

Most frequent Other Punctuation characters

Value  Count  Frequency (%)
.      10000  100.0%

Most occurring scripts

Value   Count   Frequency (%)
Latin   140000  87.5%
Common  20000   12.5%

Most frequent Latin characters

Value  Count  Frequency (%)
m      20000  14.3%
c      20000  14.3%
s      20000  14.3%
d      10000  7.1%
i      10000  7.1%
u      10000  7.1%
t      10000  7.1%
o      10000  7.1%
e      10000  7.1%
r      10000  7.1%
v      10000  7.1%

Most frequent Common characters

Value  Count  Frequency (%)
_      10000  50.0%
.      10000  50.0%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  160000  100.0%

Most frequent ASCII characters

Value  Count  Frequency (%)
m      20000  12.5%
c      20000  12.5%
s      20000  12.5%
d      10000  6.2%
i      10000  6.2%
_      10000  6.2%
u      10000  6.2%
t      10000  6.2%
o      10000  6.2%
e      10000  6.2%
r      10000  6.2%
.      10000  6.2%
v      10000  6.2%

_LINE
Real number (ℝ≥0)

UNIQUE

Distinct count: 10000
Unique (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12313.6977
Minimum: 0
Maximum: 18483
Zeros: 1
Zeros (%): < 0.1%
Memory size: 19.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1345.95
Q1: 10983.75
Median: 13483.5
Q3: 15983.25
95-th percentile: 17983.05
Maximum: 18483
Range: 18483
Interquartile range (IQR): 4999.5

Descriptive statistics

Standard deviation: 5093.443851
Coefficient of variation (CV): 0.4136404819
Kurtosis: 0.3780766517
Mean: 12313.6977
Median Absolute Deviation (MAD): 2500
Skewness: -1.185599209
Sum: 123136977
Variance: 25943170.26
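
Several of the figures above are derived from others, which gives a quick consistency check. The sketch below recomputes them from the values printed in this _LINE summary, using the standard definitions (CV as standard deviation over mean, variance as squared standard deviation, IQR as Q3 minus Q1). The variable names are chosen for this example only.

```python
import math

# Values copied from the _LINE summary in this report.
n = 10000
mean = 12313.6977
std = 5093.443851
q1, q3 = 10983.75, 15983.25
minimum, maximum = 0, 18483

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range
total = mean * n                 # sum of all values

print(cv, variance, iqr, value_range, total)
```

Up to rounding, the results agree with the reported CV, variance, IQR, range and sum.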
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
184311< 0.1%
 
115831< 0.1%
 
13541< 0.1%
 
115911< 0.1%
 
180231< 0.1%
 
156851< 0.1%
 
136361< 0.1%
 
180501< 0.1%
 
13461< 0.1%
 
156771< 0.1%
 
156931< 0.1%
 
136281< 0.1%
 
13381< 0.1%
 
115751< 0.1%
 
181681< 0.1%
 
156691< 0.1%
 
136201< 0.1%
 
13301< 0.1%
 
136441< 0.1%
 
19701< 0.1%
 
117671< 0.1%
 
13781< 0.1%
 
136761< 0.1%
 
165361< 0.1%
 
13861< 0.1%
 
Other values (9975)997599.8%
 
ValueCountFrequency (%) 
01< 0.1%
 
11< 0.1%
 
21< 0.1%
 
101< 0.1%
 
111< 0.1%
 
121< 0.1%
 
131< 0.1%
 
141< 0.1%
 
151< 0.1%
 
161< 0.1%
 
ValueCountFrequency (%) 
184831< 0.1%
 
184821< 0.1%
 
184811< 0.1%
 
184801< 0.1%
 
184791< 0.1%
 
184781< 0.1%
 
184771< 0.1%
 
184761< 0.1%
 
184751< 0.1%
 
184741< 0.1%
 

CUSTOMER_KEY
Real number (ℝ≥0)

UNIQUE

Distinct count10000
Unique (%)100.0%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean20241.5513
Minimum11000
Maximum29388
Zeros0
Zeros (%)0.0%
Memory size19.7 KiB

Quantile statistics

Minimum11000
5-th percentile12151.95
Q115132.75
median20715.5
Q324746.25
95-th percentile28201.05
Maximum29388
Range18388
Interquartile range (IQR)9613.5

Descriptive statistics

Standard deviation5211.229748
Coefficient of variation (CV)0.257452093
Kurtosis-1.172990799
Mean20241.5513
Median Absolute Deviation (MAD)4334
Skewness-0.0528301809
Sum202415513
Variance27156915.49
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
184311< 0.1%
 
134041< 0.1%
 
134201< 0.1%
 
195631< 0.1%
 
175141< 0.1%
 
236571< 0.1%
 
236491< 0.1%
 
256941< 0.1%
 
292761< 0.1%
 
174981< 0.1%
 
175221< 0.1%
 
256861< 0.1%
 
133961< 0.1%
 
174901< 0.1%
 
236331< 0.1%
 
277271< 0.1%
 
256781< 0.1%
 
133881< 0.1%
 
133631< 0.1%
 
195711< 0.1%
 
136201< 0.1%
 
175461< 0.1%
 
175621< 0.1%
 
237051< 0.1%
 
134601< 0.1%
 
Other values (9975)997599.8%
 
ValueCountFrequency (%) 
110001< 0.1%
 
110021< 0.1%
 
110061< 0.1%
 
110151< 0.1%
 
110191< 0.1%
 
110231< 0.1%
 
110241< 0.1%
 
110251< 0.1%
 
110261< 0.1%
 
110271< 0.1%
 
ValueCountFrequency (%) 
293881< 0.1%
 
293741< 0.1%
 
293731< 0.1%
 
293721< 0.1%
 
293691< 0.1%
 
293681< 0.1%
 
293671< 0.1%
 
293661< 0.1%
 
293651< 0.1%
 
293631< 0.1%
 

GEOGRAPHY_KEY
Real number (ℝ≥0)

Distinct count314
Unique (%)3.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean255.531
Minimum2
Maximum653
Zeros0
Zeros (%)0.0%
Memory size19.7 KiB

Quantile statistics

Minimum2
5-th percentile13
Q166
median237
Q3339
95-th percentile631
Maximum653
Range651
Interquartile range (IQR)273

Descriptive statistics

Standard deviation192.6080004
Coefficient of variation (CV)0.753755906
Kurtosis-0.5996480352
Mean255.531
Median Absolute Deviation (MAD)118
Skewness0.5855306408
Sum2555310
Variance37097.84182
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
3111291.3%
 
6121101.1%
 
2991071.1%
 
2981061.1%
 
6091061.1%
 
491041.0%
 
3021031.0%
 
3071011.0%
 
5361011.0%
 
301930.9%
 
300930.9%
 
611920.9%
 
310820.8%
 
312710.7%
 
316630.6%
 
539620.6%
 
543610.6%
 
315610.6%
 
343610.6%
 
51610.6%
 
20610.6%
 
4600.6%
 
335600.6%
 
25600.6%
 
32590.6%
 
Other values (289)793379.3%
 
ValueCountFrequency (%) 
2580.6%
 
3410.4%
 
4600.6%
 
5390.4%
 
6480.5%
 
7380.4%
 
8440.4%
 
9410.4%
 
10340.3%
 
11420.4%
 
ValueCountFrequency (%) 
6531< 0.1%
 
648480.5%
 
644540.5%
 
642470.5%
 
641410.4%
 
638490.5%
 
637470.5%
 
635500.5%
 
634460.5%
 
633570.6%
 

CUSTOMER_ALTERNATE_KEY
Categorical

UNIQUE

Distinct count10000
Unique (%)100.0%
Missing0
Missing (%)0.0%
Memory size78.2 KiB
AW00017662
 
1
AW00020303
 
1
AW00022525
 
1
AW00022628
 
1
AW00016404
 
1
Other values (9995)
9995
ValueCountFrequency (%) 
AW000176621< 0.1%
 
AW000203031< 0.1%
 
AW000225251< 0.1%
 
AW000226281< 0.1%
 
AW000164041< 0.1%
 
AW000170031< 0.1%
 
AW000233141< 0.1%
 
AW000125441< 0.1%
 
AW000162621< 0.1%
 
AW000269261< 0.1%
 
AW000171141< 0.1%
 
AW000289041< 0.1%
 
AW000200241< 0.1%
 
AW000206471< 0.1%
 
AW000237861< 0.1%
 
AW000225191< 0.1%
 
AW000292381< 0.1%
 
AW000146961< 0.1%
 
AW000205451< 0.1%
 
AW000274511< 0.1%
 
AW000127961< 0.1%
 
AW000212581< 0.1%
 
AW000121021< 0.1%
 
AW000172311< 0.1%
 
AW000255121< 0.1%
 
Other values (9975)997599.8%
 

Length

Max length10
Median length10
Mean length10
Min length10

Overview of Unicode Properties

Unique unicode characters12
Unique unicode categories (?)2
Unique unicode scripts (?)2
Unique unicode blocks (?)1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
03403034.0%
 
21018610.2%
 
A1000010.0%
 
W1000010.0%
 
183808.4%
 
442904.3%
 
741614.2%
 
340414.0%
 
539383.9%
 
638673.9%
 
836803.7%
 
934273.4%
 

Most occurring categories

ValueCountFrequency (%) 
Decimal Number8000080.0%
 
Uppercase Letter2000020.0%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
A1000050.0%
 
W1000050.0%
 

Most frequent Decimal Number characters

ValueCountFrequency (%) 
03403042.5%
 
21018612.7%
 
1838010.5%
 
442905.4%
 
741615.2%
 
340415.1%
 
539384.9%
 
638674.8%
 
836804.6%
 
934274.3%
 

Most occurring scripts

ValueCountFrequency (%) 
Common8000080.0%
 
Latin2000020.0%
 

Most frequent Latin characters

ValueCountFrequency (%) 
A1000050.0%
 
W1000050.0%
 

Most frequent Common characters

ValueCountFrequency (%) 
03403042.5%
 
21018612.7%
 
1838010.5%
 
442905.4%
 
741615.2%
 
340415.1%
 
539384.9%
 
638674.8%
 
836804.6%
 
934274.3%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII100000100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
03403034.0%
 
21018610.2%
 
A1000010.0%
 
W1000010.0%
 
183808.4%
 
442904.3%
 
741614.2%
 
340414.0%
 
539383.9%
 
638673.9%
 
836803.7%
 
934273.4%
 

TITLE
Categorical

Distinct count: 5
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 78.2 KiB
Value  Count  Frequency (%)
null   9946   99.5%
Mr.    30     0.3%
Ms.    20     0.2%
Sr.    3      < 0.1%
Ms     1      < 0.1%

Length

Max length4
Median length4
Mean length3.9945
Min length2
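
The TITLE summary reports 0 missing values even though the literal string `null` occurs 9946 times, which suggests (an assumption, not something the report states) that `null` is a placeholder that was read as ordinary text. A minimal sketch of treating it as missing, rebuilt from the value counts in this section; `NULL_TOKEN` and `to_missing` are hypothetical names:

```python
NULL_TOKEN = "null"  # assumed placeholder for a missing value


def to_missing(value):
    """Map the placeholder string to None and leave real values unchanged."""
    return None if value == NULL_TOKEN else value


# Rebuild the TITLE column from the value counts listed in this report.
title_counts = {"null": 9946, "Mr.": 30, "Ms.": 20, "Sr.": 3, "Ms": 1}
titles = [value for value, count in title_counts.items() for _ in range(count)]

# The reported mean length includes the 4-character placeholder.
mean_length = sum(len(t) for t in titles) / len(titles)

cleaned = [to_missing(t) for t in titles]
missing = sum(t is None for t in cleaned)
print(mean_length, missing, missing / len(cleaned))
```

Under that assumption TITLE would be about 99.5% missing, and `Ms` versus `Ms.` would remain as a separate inconsistency to normalise.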

Overview of Unicode Properties

Unique unicode characters8
Unique unicode categories (?)3
Unique unicode scripts (?)2
Unique unicode blocks (?)1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
l1989249.8%
 
n994624.9%
 
u994624.9%
 
.530.1%
 
M510.1%
 
r330.1%
 
s210.1%
 
S3< 0.1%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter3983899.7%
 
Uppercase Letter540.1%
 
Other Punctuation530.1%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
l1989249.9%
 
n994625.0%
 
u994625.0%
 
r330.1%
 
s210.1%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
M5194.4%
 
S35.6%
 

Most frequent Other Punctuation characters

ValueCountFrequency (%) 
.53100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin3989299.9%
 
Common530.1%
 

Most frequent Latin characters

ValueCountFrequency (%) 
l1989249.9%
 
n994624.9%
 
u994624.9%
 
M510.1%
 
r330.1%
 
s210.1%
 
S3< 0.1%
 

Most frequent Common characters

ValueCountFrequency (%) 
.53100.0%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII39945100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
l1989249.8%
 
n994624.9%
 
u994624.9%
 
.530.1%
 
M510.1%
 
r330.1%
 
s210.1%
 
S3< 0.1%
 

FIRST_NAME
Categorical

HIGH CARDINALITY

Distinct count626
Unique (%)6.3%
Missing0
Missing (%)0.0%
Memory size78.2 KiB
Isabella
 
52
Marcus
 
52
Julia
 
52
Seth
 
51
Xavier
 
51
Other values (621)
9742
ValueCountFrequency (%) 
Isabella520.5%
 
Marcus520.5%
 
Julia520.5%
 
Seth510.5%
 
Xavier510.5%
 
Eduardo500.5%
 
Sydney500.5%
 
Kaitlyn500.5%
 
Natalie490.5%
 
Chloe480.5%
 
Rachel470.5%
 
Lucas470.5%
 
Katherine470.5%
 
Devin450.4%
 
Amanda450.4%
 
Jonathan450.4%
 
Olivia440.4%
 
Alexandra440.4%
 
Richard440.4%
 
James440.4%
 
Dalton430.4%
 
Charles430.4%
 
Morgan430.4%
 
Wyatt430.4%
 
Jennifer420.4%
 
Other values (601)882988.3%
 

Length

Max length11
Median length6
Mean length5.9349
Min length2

Overview of Unicode Properties

Unique unicode characters56
Unique unicode categories (?)5
Unique unicode scripts (?)2
Unique unicode blocks (?)2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
a790313.3%
 
e598710.1%
 
n51018.6%
 
i44647.5%
 
r42047.1%
 
l35766.0%
 
o25424.3%
 
s21413.6%
 
t20443.4%
 
y20063.4%
 
h18133.1%
 
d15602.6%
 
c12872.2%
 
J12412.1%
 
A10521.8%
 
u10321.7%
 
m9041.5%
 
C8831.5%
 
M8761.5%
 
K6441.1%
 
S6101.0%
 
b6081.0%
 
D6071.0%
 
R5741.0%
 
g4880.8%
 
Other values (31)52028.8%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter4934483.1%
 
Uppercase Letter1000216.9%
 
Dash Punctuation1< 0.1%
 
Space Separator1< 0.1%
 
Other Punctuation1< 0.1%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
J124112.4%
 
A105210.5%
 
C8838.8%
 
M8768.8%
 
K6446.4%
 
S6106.1%
 
D6076.1%
 
R5745.7%
 
B4704.7%
 
E4554.5%
 
T4424.4%
 
L4294.3%
 
G2973.0%
 
N2822.8%
 
H2082.1%
 
I1691.7%
 
P1631.6%
 
W1631.6%
 
F1361.4%
 
V1091.1%
 
O920.9%
 
X510.5%
 
Z410.4%
 
Y80.1%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
a790316.0%
 
e598712.1%
 
n510110.3%
 
i44649.0%
 
r42048.5%
 
l35767.2%
 
o25425.2%
 
s21414.3%
 
t20444.1%
 
y20064.1%
 
h18133.7%
 
d15603.2%
 
c12872.6%
 
u10322.1%
 
m9041.8%
 
b6081.2%
 
g4881.0%
 
v4420.9%
 
k3540.7%
 
w1890.4%
 
f1890.4%
 
x1610.3%
 
p1220.2%
 
j770.2%
 
z660.1%
 
Other values (4)840.2%
 

Most frequent Dash Punctuation characters

ValueCountFrequency (%) 
-1100.0%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
1100.0%
 

Most frequent Other Punctuation characters

ValueCountFrequency (%) 
.1100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin59346> 99.9%
 
Common3< 0.1%
 

Most frequent Latin characters

ValueCountFrequency (%) 
a790313.3%
 
e598710.1%
 
n51018.6%
 
i44647.5%
 
r42047.1%
 
l35766.0%
 
o25424.3%
 
s21413.6%
 
t20443.4%
 
y20063.4%
 
h18133.1%
 
d15602.6%
 
c12872.2%
 
J12412.1%
 
A10521.8%
 
u10321.7%
 
m9041.5%
 
C8831.5%
 
M8761.5%
 
K6441.1%
 
S6101.0%
 
b6081.0%
 
D6071.0%
 
R5741.0%
 
g4880.8%
 
Other values (28)51998.8%
 

Most frequent Common characters

ValueCountFrequency (%) 
-133.3%
 
133.3%
 
.133.3%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII59328> 99.9%
 
None21< 0.1%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
a790313.3%
 
e598710.1%
 
n51018.6%
 
i44647.5%
 
r42047.1%
 
l35766.0%
 
o25424.3%
 
s21413.6%
 
t20443.4%
 
y20063.4%
 
h18133.1%
 
d15602.6%
 
c12872.2%
 
J12412.1%
 
A10521.8%
 
u10321.7%
 
m9041.5%
 
C8831.5%
 
M8761.5%
 
K6441.1%
 
S6101.0%
 
b6081.0%
 
D6071.0%
 
R5741.0%
 
g4880.8%
 
Other values (28)51818.7%
 

Most frequent None characters

ValueCountFrequency (%) 
é1990.5%
 
í14.8%
 
ñ14.8%
 

MIDDLE_NAME
Categorical

Distinct count40
Unique (%)0.4%
Missing0
Missing (%)0.0%
Memory size78.2 KiB
null
4213
L
 
696
A
 
684
M
 
619
C
 
521
Other values (35)
3267
ValueCountFrequency (%) 
null421342.1%
 
L6967.0%
 
A6846.8%
 
M6196.2%
 
C5215.2%
 
J5175.2%
 
E3873.9%
 
R3683.7%
 
D3003.0%
 
S2522.5%
 
K1992.0%
 
W1911.9%
 
G1541.5%
 
B1451.5%
 
H1321.3%
 
T1271.3%
 
P1151.1%
 
F1131.1%
 
V760.8%
 
N530.5%
 
I520.5%
 
O350.4%
 
Y100.1%
 
Z100.1%
 
Q50.1%
 
Other values (15)260.3%
 

Length

Max length6
Median length1
Mean length2.2668
Min length1

Overview of Unicode Properties

Unique unicode characters35
Unique unicode categories (?)3
Unique unicode scripts (?)2
Unique unicode blocks (?)1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
l842937.2%
 
n421418.6%
 
u421418.6%
 
L6973.1%
 
A6853.0%
 
M6222.7%
 
J5232.3%
 
C5212.3%
 
E3871.7%
 
R3731.6%
 
D3001.3%
 
S2521.1%
 
K2000.9%
 
W1910.8%
 
G1550.7%
 
B1460.6%
 
H1330.6%
 
T1270.6%
 
P1150.5%
 
F1140.5%
 
V760.3%
 
N540.2%
 
I520.2%
 
O350.2%
 
.190.1%
 
Other values (10)340.1%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter1686574.4%
 
Uppercase Letter578425.5%
 
Other Punctuation190.1%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
L69712.1%
 
A68511.8%
 
M62210.8%
 
J5239.0%
 
C5219.0%
 
E3876.7%
 
R3736.4%
 
D3005.2%
 
S2524.4%
 
K2003.5%
 
W1913.3%
 
G1552.7%
 
B1462.5%
 
H1332.3%
 
T1272.2%
 
P1152.0%
 
F1142.0%
 
V761.3%
 
N540.9%
 
I520.9%
 
O350.6%
 
Z100.2%
 
Y100.2%
 
Q50.1%
 
X1< 0.1%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
l842950.0%
 
n421425.0%
 
u421425.0%
 
a2< 0.1%
 
r2< 0.1%
 
d1< 0.1%
 
i1< 0.1%
 
e1< 0.1%
 
o1< 0.1%
 

Most frequent Other Punctuation characters

ValueCountFrequency (%) 
.19100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin2264999.9%
 
Common190.1%
 

Most frequent Latin characters

ValueCountFrequency (%) 
l842937.2%
 
n421418.6%
 
u421418.6%
 
L6973.1%
 
A6853.0%
 
M6222.7%
 
J5232.3%
 
C5212.3%
 
E3871.7%
 
R3731.6%
 
D3001.3%
 
S2521.1%
 
K2000.9%
 
W1910.8%
 
G1550.7%
 
B1460.6%
 
H1330.6%
 
T1270.6%
 
P1150.5%
 
F1140.5%
 
V760.3%
 
N540.2%
 
I520.2%
 
O350.2%
 
Z10< 0.1%
 
Other values (9)240.1%
 

Most frequent Common characters

ValueCountFrequency (%) 
.19100.0%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII22668100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
l842937.2%
 
n421418.6%
 
u421418.6%
 
L6973.1%
 
A6853.0%
 
M6222.7%
 
J5232.3%
 
C5212.3%
 
E3871.7%
 
R3731.6%
 
D3001.3%
 
S2521.1%
 
K2000.9%
 
W1910.8%
 
G1550.7%
 
B1460.6%
 
H1330.6%
 
T1270.6%
 
P1150.5%
 
F1140.5%
 
V760.3%
 
N540.2%
 
I520.2%
 
O350.2%
 
.190.1%
 
Other values (10)340.1%
 

LAST_NAME
Categorical

HIGH CARDINALITY

Distinct count318
Unique (%)3.2%
Missing0
Missing (%)0.0%
Memory size78.2 KiB
Diaz
 
118
Martin
 
101
Hernandez
 
100
Sanchez
 
97
Xu
 
94
Other values (313)
9490
ValueCountFrequency (%) 
Diaz1181.2%
 
Martin1011.0%
 
Hernandez1001.0%
 
Sanchez971.0%
 
Xu940.9%
 
Torres930.9%
 
Martinez910.9%
 
Lopez890.9%
 
Perez880.9%
 
Rodriguez870.9%
 
Gonzalez800.8%
 
Garcia790.8%
 
Shan770.8%
 
Kumar750.8%
 
Jai740.7%
 
Perry740.7%
 
Hughes720.7%
 
Russell700.7%
 
Lal690.7%
 
Washington680.7%
 
Ross680.7%
 
Patterson660.7%
 
Butler660.7%
 
Carlson660.7%
 
Romero650.7%
 
Other values (293)797379.7%
 

Length

Max length16
Median length6
Mean length5.5231
Min length2

Overview of Unicode Properties

Unique unicode characters58
Unique unicode categories (?)4
Unique unicode scripts (?)2
Unique unicode blocks (?)2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
a564210.2%
 
e552910.0%
 
r47008.5%
 
n45728.3%
 
o39857.2%
 
i28005.1%
 
s25484.6%
 
l25304.6%
 
z16272.9%
 
u16272.9%
 
t15792.9%
 
h15342.8%
 
d12312.2%
 
S10952.0%
 
g10912.0%
 
m10832.0%
 
R10681.9%
 
M7761.4%
 
C7021.3%
 
G6481.2%
 
H6051.1%
 
P5881.1%
 
L5771.0%
 
c5711.0%
 
W5391.0%
 
Other values (33)598410.8%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter4521681.9%
 
Uppercase Letter1001118.1%
 
Space Separator3< 0.1%
 
Dash Punctuation1< 0.1%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
S109510.9%
 
R106810.7%
 
M7767.8%
 
C7027.0%
 
G6486.5%
 
H6056.0%
 
P5885.9%
 
L5775.8%
 
W5395.4%
 
B5295.3%
 
A4614.6%
 
J3773.8%
 
T3303.3%
 
D2892.9%
 
Z2282.3%
 
K2232.2%
 
Y2032.0%
 
N2022.0%
 
F1841.8%
 
X1531.5%
 
V971.0%
 
E890.9%
 
O450.4%
 
U3< 0.1%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
a564212.5%
 
e552912.2%
 
r470010.4%
 
n457210.1%
 
o39858.8%
 
i28006.2%
 
s25485.6%
 
l25305.6%
 
z16273.6%
 
u16273.6%
 
t15793.5%
 
h15343.4%
 
d12312.7%
 
g10912.4%
 
m10832.4%
 
c5711.3%
 
k4501.0%
 
y4481.0%
 
p3640.8%
 
w3320.7%
 
v2960.7%
 
b2420.5%
 
x1180.3%
 
f1120.2%
 
j1110.2%
 
Other values (7)940.2%
 

Most frequent Dash Punctuation characters

ValueCountFrequency (%) 
-1100.0%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
3100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin55227> 99.9%
 
Common4< 0.1%
 

Most frequent Latin characters

ValueCountFrequency (%) 
a564210.2%
 
e552910.0%
 
r47008.5%
 
n45728.3%
 
o39857.2%
 
i28005.1%
 
s25484.6%
 
l25304.6%
 
z16272.9%
 
u16272.9%
 
t15792.9%
 
h15342.8%
 
d12312.2%
 
S10952.0%
 
g10912.0%
 
m10832.0%
 
R10681.9%
 
M7761.4%
 
C7021.3%
 
G6481.2%
 
H6051.1%
 
P5881.1%
 
L5771.0%
 
c5711.0%
 
W5391.0%
 
Other values (31)598010.8%
 

Most frequent Common characters

ValueCountFrequency (%) 
375.0%
 
-125.0%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII5520299.9%
 
None290.1%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
a564210.2%
 
e552910.0%
 
r47008.5%
 
n45728.3%
 
o39857.2%
 
i28005.1%
 
s25484.6%
 
l25304.6%
 
z16272.9%
 
u16272.9%
 
t15792.9%
 
h15342.8%
 
d12312.2%
 
S10952.0%
 
g10912.0%
 
m10832.0%
 
R10681.9%
 
M7761.4%
 
C7021.3%
 
G6481.2%
 
H6051.1%
 
P5881.1%
 
L5771.0%
 
c5711.0%
 
W5391.0%
 
Other values (27)595510.8%
 

Most frequent None characters

ValueCountFrequency (%) 
é1965.5%
 
á517.2%
 
ñ26.9%
 
ø13.4%
 
ó13.4%
 
ã13.4%
 

NAME_STYLE
Boolean

CONSTANT
REJECTED

Distinct count: 1
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 9.9 KiB
Value  Count  Frequency (%)
False  10000  100.0%

BIRTH_DATE
Categorical

HIGH CARDINALITY
UNIFORM

Distinct count6196
Unique (%)62.0%
Missing0
Missing (%)0.0%
Memory size78.2 KiB
1967-06-11T00:00:00.0000000
 
7
1972-03-13T00:00:00.0000000
 
7
1973-08-06T00:00:00.0000000
 
6
1963-09-03T00:00:00.0000000
 
6
1966-07-26T00:00:00.0000000
 
6
Other values (6191)
9968
ValueCountFrequency (%) 
1967-06-11T00:00:00.000000070.1%
 
1972-03-13T00:00:00.000000070.1%
 
1973-08-06T00:00:00.000000060.1%
 
1963-09-03T00:00:00.000000060.1%
 
1966-07-26T00:00:00.000000060.1%
 
1970-03-05T00:00:00.000000060.1%
 
1962-05-04T00:00:00.000000060.1%
 
1972-01-15T00:00:00.000000060.1%
 
1960-05-14T00:00:00.000000060.1%
 
1955-09-23T00:00:00.000000060.1%
 
1952-07-11T00:00:00.000000060.1%
 
1964-05-12T00:00:00.000000060.1%
 
1961-02-04T00:00:00.000000060.1%
 
1960-05-22T00:00:00.000000060.1%
 
1965-10-04T00:00:00.000000060.1%
 
1979-08-20T00:00:00.000000060.1%
 
1962-06-24T00:00:00.000000060.1%
 
1971-06-15T00:00:00.000000050.1%
 
1957-07-06T00:00:00.000000050.1%
 
1960-07-27T00:00:00.000000050.1%
 
1979-08-23T00:00:00.000000050.1%
 
1964-07-14T00:00:00.000000050.1%
 
1965-06-23T00:00:00.000000050.1%
 
1962-04-02T00:00:00.000000050.1%
 
1965-08-27T00:00:00.000000050.1%
 
Other values (6171)985698.6%
 

Length

Max length27
Median length27
Mean length27
Min length27
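
BIRTH_DATE is profiled as a categorical string of fixed length 27 rather than as a date. One possible way to convert such values is sketched below with the standard library only. The seven fractional-second digits are more than some `datetime` parsers accept, so this example parses just the date part. The helper name `parse_birth_date` is hypothetical.

```python
from datetime import date


def parse_birth_date(text):
    """Parse 'YYYY-MM-DDThh:mm:ss.fffffff' by keeping only the date part."""
    date_part, _, _time_part = text.partition("T")
    return date.fromisoformat(date_part)


sample = "1967-06-11T00:00:00.0000000"
print(len(sample), parse_birth_date(sample))
```

Every time component listed in this section is midnight, so dropping it loses nothing for the values shown here.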

Overview of Unicode Properties

Unique unicode characters14
Unique unicode categories (?)4
Unique unicode scripts (?)2
Unique unicode blocks (?)1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
014357553.2%
 
-200007.4%
 
:200007.4%
 
1192327.1%
 
9126444.7%
 
T100003.7%
 
.100003.7%
 
266412.5%
 
664472.4%
 
754972.0%
 
552501.9%
 
442391.6%
 
334631.3%
 
830121.1%
 

Most occurring categories

ValueCountFrequency (%) 
Decimal Number21000077.8%
 
Other Punctuation3000011.1%
 
Dash Punctuation200007.4%
 
Uppercase Letter100003.7%
 

Most frequent Decimal Number characters

ValueCountFrequency (%) 
014357568.4%
 
1192329.2%
 
9126446.0%
 
266413.2%
 
664473.1%
 
754972.6%
 
552502.5%
 
442392.0%
 
334631.6%
 
830121.4%
 

Most frequent Dash Punctuation characters

ValueCountFrequency (%) 
-20000100.0%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
T10000100.0%
 

Most frequent Other Punctuation characters

ValueCountFrequency (%) 
:2000066.7%
 
.1000033.3%
 

Most occurring scripts

ValueCountFrequency (%) 
Common26000096.3%
 
Latin100003.7%
 

Most frequent Common characters

ValueCountFrequency (%) 
014357555.2%
 
-200007.7%
 
:200007.7%
 
1192327.4%
 
9126444.9%
 
.100003.8%
 
266412.6%
 
664472.5%
 
754972.1%
 
552502.0%
 
442391.6%
 
334631.3%
 
830121.2%
 

Most frequent Latin characters

ValueCountFrequency (%) 
T10000100.0%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII270000100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
014357553.2%
 
-200007.4%
 
:200007.4%
 
1192327.1%
 
9126444.7%
 
T100003.7%
 
.100003.7%
 
266412.5%
 
664472.4%
 
754972.0%
 
552501.9%
 
442391.6%
 
334631.3%
 
830121.1%
 

MARITAL_STATUS
Categorical

Distinct count: 2
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 78.2 KiB
Value  Count  Frequency (%)
M      5437   54.4%
S      4563   45.6%

Length

Max length1
Median length1
Mean length1
Min length1

Overview of Unicode Properties

Unique unicode characters2
Unique unicode categories (?)1
Unique unicode scripts (?)1
Unique unicode blocks (?)1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
M543754.4%
 
S456345.6%
 

Most occurring categories

ValueCountFrequency (%) 
Uppercase Letter10000100.0%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
M543754.4%
 
S456345.6%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin10000100.0%
 

Most frequent Latin characters

ValueCountFrequency (%) 
M543754.4%
 
S456345.6%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII10000100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
M543754.4%
 
S456345.6%
 

SUFFIX
Categorical

Distinct count: 2
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 78.2 KiB
Value  Count  Frequency (%)
null   9999   > 99.9%
Jr.    1      < 0.1%

Length

Max length4
Median length4
Mean length3.9999
Min length3

Overview of Unicode Properties

Unique unicode characters6
Unique unicode categories (?)3
Unique unicode scripts (?)2
Unique unicode blocks (?)1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
l1999850.0%
 
n999925.0%
 
u999925.0%
 
J1< 0.1%
 
r1< 0.1%
 
.1< 0.1%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter39997> 99.9%
 
Uppercase Letter1< 0.1%
 
Other Punctuation1< 0.1%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
l1999850.0%
 
n999925.0%
 
u999925.0%
 
r1< 0.1%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
J1100.0%
 

Most frequent Other Punctuation characters

ValueCountFrequency (%) 
.1100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin39998> 99.9%
 
Common1< 0.1%
 

Most frequent Latin characters

ValueCountFrequency (%) 
l1999850.0%
 
n999925.0%
 
u999925.0%
 
J1< 0.1%
 
r1< 0.1%
 

Most frequent Common characters

ValueCountFrequency (%) 
.1100.0%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII39999100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
l1999850.0%
 
n999925.0%
 
u999925.0%
 
J1< 0.1%
 
r1< 0.1%
 
.1< 0.1%
 

GENDER
Categorical

Distinct count: 2
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 78.2 KiB
Value  Count  Frequency (%)
M      5112   51.1%
F      4888   48.9%

Length

Max length1
Median length1
Mean length1
Min length1

Overview of Unicode Properties

Unique unicode characters2
Unique unicode categories (?)1
Unique unicode scripts (?)1
Unique unicode blocks (?)1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
M511251.1%
 
F488848.9%
 

Most occurring categories

ValueCountFrequency (%) 
Uppercase Letter10000100.0%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
M511251.1%
 
F488848.9%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin10000100.0%
 

Most frequent Latin characters

ValueCountFrequency (%) 
M511251.1%
 
F488848.9%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII10000100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
M511251.1%
 
F488848.9%
 

EMAIL_ADDRESS
Categorical

UNIQUE

Distinct count10000
Unique (%)100.0%
Missing0
Missing (%)0.0%
Memory size78.2 KiB

Length

Max length33
Median length28
Mean length27.6572
Min length22

Overview of Unicode Properties

Unique unicode characters42
Unique unicode categories (?)4
Unique unicode scripts (?)2
Unique unicode blocks (?)2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
e264429.6%
 
r247789.0%
 
o226348.2%
 
a189556.9%
 
n153835.6%
 
s127514.6%
 
t124864.5%
 
c121704.4%
 
d121674.4%
 
m117804.3%
 
u110324.0%
 
k109984.0%
 
v105513.8%
 
w103523.7%
 
-100013.6%
 
@100003.6%
 
.100003.6%
 
i46331.7%
 
140281.5%
 
l40051.4%
 
226751.0%
 
h20210.7%
 
y20140.7%
 
319920.7%
 
417250.6%
 
Other values (17)109994.0%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter22934682.9%
 
Other Punctuation200007.2%
 
Decimal Number172256.2%
 
Dash Punctuation100013.6%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
e2644211.5%
 
r2477810.8%
 
o226349.9%
 
a189558.3%
 
n153836.7%
 
s127515.6%
 
t124865.4%
 
c121705.3%
 
d121675.3%
 
m117805.1%
 
u110324.8%
 
k109984.8%
 
v105514.6%
 
w103524.5%
 
i46332.0%
 
l40051.7%
 
h20210.9%
 
y20140.9%
 
j13180.6%
 
b10780.5%
 
g7850.3%
 
f3250.1%
 
p2850.1%
 
x2120.1%
 
z107< 0.1%
 
Other values (4)84< 0.1%
 

Most frequent Decimal Number characters

ValueCountFrequency (%) 
1402823.4%
 
2267515.5%
 
3199211.6%
 
4172510.0%
 
513758.0%
 
612347.2%
 
011076.4%
 
710936.3%
 
810616.2%
 
99355.4%
 

Most frequent Other Punctuation characters

ValueCountFrequency (%) 
@1000050.0%
 
.1000050.0%
 

Most frequent Dash Punctuation characters

ValueCountFrequency (%) 
-10001100.0%
 

Most occurring scripts

Value    Count    Frequency (%)
Latin    229346   82.9%
Common   47226    17.1%

Most frequent Latin characters

Value    Count   Frequency (%)
e        26442   11.5%
r        24778   10.8%
o        22634   9.9%
a        18955   8.3%
n        15383   6.7%
s        12751   5.6%
t        12486   5.4%
c        12170   5.3%
d        12167   5.3%
m        11780   5.1%
u        11032   4.8%
k        10998   4.8%
v        10551   4.6%
w        10352   4.5%
i        4633    2.0%
l        4005    1.7%
h        2021    0.9%
y        2014    0.9%
j        1318    0.6%
b        1078    0.5%
g        785     0.3%
f        325     0.1%
p        285     0.1%
x        212     0.1%
z        107     < 0.1%
Other values (4)  84  < 0.1%

Most frequent Common characters

Value    Count   Frequency (%)
-        10001   21.2%
@        10000   21.2%
.        10000   21.2%
1        4028    8.5%
2        2675    5.7%
3        1992    4.2%
4        1725    3.7%
5        1375    2.9%
6        1234    2.6%
0        1107    2.3%
7        1093    2.3%
8        1061    2.2%
9        935     2.0%

Most occurring blocks

Value    Count    Frequency (%)
ASCII    276551   > 99.9%
None     21       < 0.1%

Most frequent ASCII characters

Value    Count   Frequency (%)
e        26442   9.6%
r        24778   9.0%
o        22634   8.2%
a        18955   6.9%
n        15383   5.6%
s        12751   4.6%
t        12486   4.5%
c        12170   4.4%
d        12167   4.4%
m        11780   4.3%
u        11032   4.0%
k        10998   4.0%
v        10551   3.8%
w        10352   3.7%
-        10001   3.6%
@        10000   3.6%
.        10000   3.6%
i        4633    1.7%
1        4028    1.5%
l        4005    1.4%
2        2675    1.0%
h        2021    0.7%
y        2014    0.7%
3        1992    0.7%
4        1725    0.6%
Other values (14)  10978   4.0%

Most frequent None characters

Value    Count   Frequency (%)
é        19      90.5%
í        1       4.8%
ñ        1       4.8%

YEARLY_INCOME
Real number (ℝ≥0)

Distinct count: 16
Unique (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 56410.0
Minimum: 10000.0
Maximum: 170000.0
Zeros: 0
Zeros (%): 0.0%
Memory size: 78.2 KiB

Quantile statistics

Minimum: 10000
5-th percentile: 10000
Q1: 30000
median: 60000
Q3: 70000
95-th percentile: 120000
Maximum: 170000
Range: 160000
Interquartile range (IQR): 40000

Descriptive statistics

Standard deviation: 32681.47347
Coefficient of variation (CV): 0.5793560267
Kurtosis: 0.6254303313
Mean: 56410
Median Absolute Deviation (MAD): 20000
Skewness: 0.8421766808
Sum: 564100000
Variance: 1068078708
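
The mean, variance, and coefficient of variation reported above can be recomputed from the YEARLY_INCOME value counts in this section. The sketch below is a plain-Python illustration, not the profiler's own code. The name income_counts is hypothetical, and dividing by n - 1 (sample variance) is an assumption that happens to reproduce the reported variance.

```python
import math

# Value -> count pairs copied from the YEARLY_INCOME frequency table.
income_counts = {
    10000: 690, 20000: 1028, 30000: 1301, 40000: 1492,
    50000: 335, 60000: 1564, 70000: 1240, 80000: 734,
    90000: 432, 100000: 310, 110000: 253, 120000: 174,
    130000: 281, 150000: 50, 160000: 52, 170000: 64,
}

n = sum(income_counts.values())
total = sum(value * count for value, count in income_counts.items())
mean = total / n

# Assumption: sample variance with an n - 1 divisor.
variance = sum(
    count * (value - mean) ** 2 for value, count in income_counts.items()
) / (n - 1)
std = math.sqrt(variance)

# Coefficient of variation: standard deviation relative to the mean.
cv = std / mean

print(n, total, mean, round(variance), round(cv, 4))
```

The coefficient of variation is simply the standard deviation divided by the mean, so it can also be checked directly from the two reported figures.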
Histogram with fixed size bins (bins=10)
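
The caption above describes a histogram with 10 equal-width bins. How the profiler draws it is not shown in this report, so the following is only a sketch of equal-width binning in plain Python. The function name fixed_width_histogram and the dictionary income_counts are hypothetical, and placing the maximum value in the last bin is an assumed edge convention.

```python
def fixed_width_histogram(value_counts, bins=10):
    """Bin a value -> count mapping into equal-width intervals."""
    lo, hi = min(value_counts), max(value_counts)
    width = (hi - lo) / bins
    counts = [0] * bins
    for value, count in value_counts.items():
        # The maximum lies exactly on the last edge, so clamp it into the final bin.
        index = min(int((value - lo) / width), bins - 1)
        counts[index] += count
    edges = [lo + i * width for i in range(bins + 1)]
    return edges, counts


# Value -> count pairs copied from the YEARLY_INCOME frequency table.
income_counts = {
    10000: 690, 20000: 1028, 30000: 1301, 40000: 1492,
    50000: 335, 60000: 1564, 70000: 1240, 80000: 734,
    90000: 432, 100000: 310, 110000: 253, 120000: 174,
    130000: 281, 150000: 50, 160000: 52, 170000: 64,
}

edges, counts = fixed_width_histogram(income_counts, bins=10)
for left, right, count in zip(edges, edges[1:], counts):
    print(f"{left:>9.0f} to {right:>9.0f}: {count}")
```

With a range of 160000 and 10 bins, each bin is 16000 wide, so most bins collect two adjacent income values.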
Value    Count   Frequency (%)
60000    1564    15.6%
40000    1492    14.9%
30000    1301    13.0%
70000    1240    12.4%
20000    1028    10.3%
80000    734     7.3%
10000    690     6.9%
90000    432     4.3%
50000    335     3.4%
100000   310     3.1%
130000   281     2.8%
110000   253     2.5%
120000   174     1.7%
170000   64      0.6%
160000   52      0.5%
150000   50      0.5%

Value    Count   Frequency (%)
10000    690     6.9%
20000    1028    10.3%
30000    1301    13.0%
40000    1492    14.9%
50000    335     3.4%
60000    1564    15.6%
70000    1240    12.4%
80000    734     7.3%
90000    432     4.3%
100000   310     3.1%

Value    Count   Frequency (%)
170000   64      0.6%
160000   52      0.5%
150000   50      0.5%
130000   281     2.8%
120000   174     1.7%
110000   253     2.5%
100000   310     3.1%
90000    432     4.3%
80000    734     7.3%
70000    1240    12.4%

TOTAL_CHILDREN
Real number (ℝ≥0)

ZEROS

Distinct count: 6
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.846
Minimum: 0
Maximum: 5
Zeros: 2758
Zeros (%): 27.6%
Memory size: 9.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 2
Q3: 3
95-th percentile: 5
Maximum: 5
Range: 5
Interquartile range (IQR): 3
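
The quantiles above can be read off the cumulative value counts for TOTAL_CHILDREN. The sketch below uses a simplified rule (take the observation at zero-based rank floor(q * (n - 1)) in the sorted data, with no interpolation), which is an assumption and may differ from the profiler's method on other data. The names quantile_from_counts and children_counts are hypothetical.

```python
def quantile_from_counts(value_counts, q):
    """Return the value at zero-based rank floor(q * (n - 1)) of the sorted data."""
    if not 0 <= q <= 1:
        raise ValueError("q must be between 0 and 1")
    n = sum(value_counts.values())
    rank = int(q * (n - 1))
    cumulative = 0
    for value in sorted(value_counts):
        cumulative += value_counts[value]
        # The first value whose cumulative count exceeds the rank holds that observation.
        if rank < cumulative:
            return value
    raise ValueError("value_counts must not be empty")


# Value -> count pairs copied from the TOTAL_CHILDREN frequency table.
children_counts = {0: 2758, 1: 2017, 2: 2049, 3: 1154, 4: 1227, 5: 795}

q1 = quantile_from_counts(children_counts, 0.25)
median = quantile_from_counts(children_counts, 0.50)
q3 = quantile_from_counts(children_counts, 0.75)
p95 = quantile_from_counts(children_counts, 0.95)
iqr = q3 - q1

print(q1, median, q3, p95, iqr)
```

Because the variable takes only six integer values, neighbouring ranks fall on the same value here and interpolation would not change the result.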

Descriptive statistics

Standard deviation: 1.613426266
Coefficient of variation (CV): 0.8740120615
Kurtosis: -0.9306013637
Mean: 1.846
Median Absolute Deviation (MAD): 1
Skewness: 0.4862396023
Sum: 18460
Variance: 2.603144314
Histogram with fixed size bins (bins=10)
Value    Count   Frequency (%)
0        2758    27.6%
2        2049    20.5%
1        2017    20.2%
4        1227    12.3%
3        1154    11.5%
5        795     8.0%

Value    Count   Frequency (%)
0        2758    27.6%
1        2017    20.2%
2        2049    20.5%
3        1154    11.5%
4        1227    12.3%
5        795     8.0%

Value    Count   Frequency (%)
5        795     8.0%
4        1227    12.3%
3        1154    11.5%
2        2049    20.5%
1        2017    20.2%
0        2758    27.6%

NUMBER_CHILDREN_AT_HOME
Real number (ℝ≥0)

ZEROS

Distinct count: 6
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.9978
Minimum: 0
Maximum: 5
Zeros: 6021
Zeros (%): 60.2%
Memory size: 9.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 2
95-th percentile: 5
Maximum: 5
Range: 5
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.517835809
Coefficient of variation (CV): 1.52118241
Kurtosis: 0.7074371296
Mean: 0.9978
Median Absolute Deviation (MAD): 0
Skewness: 1.401347865
Sum: 9978
Variance: 2.303825543
Histogram with fixed size bins (bins=10)
Value    Count   Frequency (%)
0        6021    60.2%
1        1364    13.6%
2        844     8.4%
3        669     6.7%
4        591     5.9%
5        511     5.1%

Value    Count   Frequency (%)
0        6021    60.2%
1        1364    13.6%
2        844     8.4%
3        669     6.7%
4        591     5.9%
5        511     5.1%

Value    Count   Frequency (%)
5        511     5.1%
4        591     5.9%
3        669     6.7%
2        844     8.4%
1        1364    13.6%
0        6021    60.2%

Distinct count: 5
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 78.2 KiB

Value                Count   Frequency (%)
Bachelors            2842    28.4%
Partial College      2786    27.9%
High School          1745    17.4%
Graduate Degree      1732    17.3%
Partial High School  895     8.9%

Length

Max length: 19
Median length: 15
Mean length: 12.9548
Min length: 9
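
The length statistics above follow from the five education labels and their counts. The sketch below recomputes them in plain Python. The name education_counts is hypothetical, and expanding the counts into a full list of lengths is just one simple way to obtain the median.

```python
from statistics import median

# Label -> count pairs copied from the value table for this variable.
education_counts = {
    "Bachelors": 2842,
    "Partial College": 2786,
    "High School": 1745,
    "Graduate Degree": 1732,
    "Partial High School": 895,
}

n = sum(education_counts.values())

# Mean length is the count-weighted average of the label lengths.
mean_length = sum(len(label) * count for label, count in education_counts.items()) / n
max_length = max(len(label) for label in education_counts)
min_length = min(len(label) for label in education_counts)

# One length per row, so the median reflects how often each label occurs.
all_lengths = sorted(
    len(label) for label, count in education_counts.items() for _ in range(count)
)
median_length = median(all_lengths)

print(max_length, median_length, mean_length, min_length)
```

The total of all label lengths, 129548 characters, matches the ASCII block count reported for this variable.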

Overview of Unicode Properties

Unique unicode characters: 21
Unique unicode categories: 3
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

Value    Count   Frequency (%)
e        15342   11.8%
l        14735   11.4%
a        13668   10.6%
o        10908   8.4%
r        9987    7.7%
h        8122    6.3%
(space)  8053    6.2%
g        7158    5.5%
i        6321    4.9%
c        5482    4.2%
t        5413    4.2%
P        3681    2.8%
B        2842    2.2%
s        2842    2.2%
C        2786    2.2%
H        2640    2.0%
S        2640    2.0%
G        1732    1.3%
d        1732    1.3%
u        1732    1.3%
D        1732    1.3%

Most occurring categories

Value             Count    Frequency (%)
Lowercase Letter  103442   79.8%
Uppercase Letter  18053    13.9%
Space Separator   8053     6.2%

Most frequent Uppercase Letter characters

Value    Count   Frequency (%)
P        3681    20.4%
B        2842    15.7%
C        2786    15.4%
H        2640    14.6%
S        2640    14.6%
G        1732    9.6%
D        1732    9.6%

Most frequent Lowercase Letter characters

Value    Count   Frequency (%)
e        15342   14.8%
l        14735   14.2%
a        13668   13.2%
o        10908   10.5%
r        9987    9.7%
h        8122    7.9%
g        7158    6.9%
i        6321    6.1%
c        5482    5.3%
t        5413    5.2%
s        2842    2.7%
d        1732    1.7%
u        1732    1.7%

Most frequent Space Separator characters

Value    Count   Frequency (%)
(space)  8053    100.0%

Most occurring scripts

Value    Count    Frequency (%)
Latin    121495   93.8%
Common   8053     6.2%

Most frequent Latin characters

Value    Count   Frequency (%)
e        15342   12.6%
l        14735   12.1%
a        13668   11.2%
o        10908   9.0%
r        9987    8.2%
h        8122    6.7%
g        7158    5.9%
i        6321    5.2%
c        5482    4.5%
t        5413    4.5%
P        3681    3.0%
B        2842    2.3%
s        2842    2.3%
C        2786    2.3%
H        2640    2.2%
S        2640    2.2%
G        1732    1.4%
d        1732    1.4%
u        1732    1.4%
D        1732    1.4%

Most frequent Common characters

Value    Count   Frequency (%)
(space)  8053    100.0%

Most occurring blocks

Value    Count    Frequency (%)
ASCII    129548   100.0%

Most frequent ASCII characters

Value    Count   Frequency (%)
e        15342   11.8%
l        14735   11.4%
a        13668   10.6%
o        10908   8.4%
r        9987    7.7%
h        8122    6.3%
(space)  8053    6.2%
g        7158    5.5%
i        6321    4.9%
c        5482    4.2%
t        5413    4.2%
P        3681    2.8%
B        2842    2.2%
s        2842    2.2%
C        2786    2.2%
H        2640    2.0%
S        2640    2.0%
G        1732    1.3%
d        1732    1.3%
u        1732    1.3%
D        1732    1.3%

Distinct count: 5
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 78.2 KiB

Value                               Count   Frequency (%)
Licenciatura                        2842    28.4%
Estudios universitarios (en curso)  2786    27.9%
Educación secundaria                1745    17.4%
Estudios de postgrado               1732    17.3%
Educación secundaria (en curso)     895     8.9%

Length

Max length: 34
Median length: 21
Mean length: 22.7845
Min length: 12

Overview of Unicode Properties

Unique unicode characters: 20
Unique unicode categories: 5
Unique unicode scripts: 2
Unique unicode blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

Value    Count   Frequency (%)
i        23840   10.5%
s        22661   9.9%
u        19107   8.4%
a        18122   8.0%
c        17285   7.6%
r        16467   7.2%
(space)  16252   7.1%
n        14589   6.4%
o        14449   6.3%
e        13681   6.0%
d        13262   5.8%
t        11878   5.2%
E        7158    3.1%
(        3681    1.6%
)        3681    1.6%
L        2842    1.2%
v        2786    1.2%
ó        2640    1.2%
p        1732    0.8%
g        1732    0.8%

Most occurring categories

Value              Count    Frequency (%)
Lowercase Letter   194231   85.2%
Space Separator    16252    7.1%
Uppercase Letter   10000    4.4%
Open Punctuation   3681     1.6%
Close Punctuation  3681     1.6%

Most frequent Uppercase Letter characters

Value    Count   Frequency (%)
E        7158    71.6%
L        2842    28.4%

Most frequent Lowercase Letter characters

Value    Count   Frequency (%)
i        23840   12.3%
s        22661   11.7%
u        19107   9.8%
a        18122   9.3%
c        17285   8.9%
r        16467   8.5%
n        14589   7.5%
o        14449   7.4%
e        13681   7.0%
d        13262   6.8%
t        11878   6.1%
v        2786    1.4%
ó        2640    1.4%
p        1732    0.9%
g        1732    0.9%

Most frequent Space Separator characters

Value    Count   Frequency (%)
(space)  16252   100.0%

Most frequent Open Punctuation characters

Value    Count   Frequency (%)
(        3681    100.0%

Most frequent Close Punctuation characters

Value    Count   Frequency (%)
)        3681    100.0%

Most occurring scripts

Value    Count    Frequency (%)
Latin    204231   89.6%
Common   23614    10.4%

Most frequent Latin characters

Value    Count   Frequency (%)
i        23840   11.7%
s        22661   11.1%
u        19107   9.4%
a        18122   8.9%
c        17285   8.5%
r        16467   8.1%
n        14589   7.1%
o        14449   7.1%
e        13681   6.7%
d        13262   6.5%
t        11878   5.8%
E        7158    3.5%
L        2842    1.4%
v        2786    1.4%
ó        2640    1.3%
p        1732    0.8%
g        1732    0.8%

Most frequent Common characters

Value    Count   Frequency (%)
(space)  16252   68.8%
(        3681    15.6%
)        3681    15.6%

Most occurring blocks

Value    Count    Frequency (%)
ASCII    225205   98.8%
None     2640     1.2%

Most frequent ASCII characters

Value    Count   Frequency (%)
i        23840   10.6%
s        22661   10.1%
u        19107   8.5%
a        18122   8.0%
c        17285   7.7%
r        16467   7.3%
(space)  16252   7.2%
n        14589   6.5%
o        14449   6.4%
e        13681   6.1%
d        13262   5.9%
t        11878   5.3%
E        7158    3.2%
(        3681    1.6%
)        3681    1.6%
L        2842    1.3%
v        2786    1.2%
p        1732    0.8%
g        1732    0.8%

Most frequent None characters

Value    Count   Frequency (%)
ó        2640    100.0%

FRENCH_EDUCATION
Categorical

Distinct count: 5
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 78.2 KiB

Value         Count   Frequency (%)
Bac + 4       2842    28.4%
Baccalauréat  2786    27.9%
Bac + 2       1745    17.4%
Bac + 3       1732    17.3%
Niveau bac    895     8.9%

Length

Max length: 12
Median length: 7
Mean length: 8.6615
Min length: 7

Overview of Unicode Properties

Unique unicode characters: 18
Unique unicode categories: 5
Unique unicode scripts: 2
Unique unicode blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
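
As an illustration of the character properties mentioned above, the sketch below recomputes the Unicode general-category counts for FRENCH_EDUCATION using Python's standard unicodedata module. This is not the profiler's implementation. The names french_education_counts and CATEGORY_NAMES are hypothetical, and the accented letter is written as the precomposed code point U+00E9, which is an assumption about how it is stored in the data.

```python
import unicodedata
from collections import Counter

# Label -> count pairs copied from the FRENCH_EDUCATION value table.
# Assumption: "é" is stored as the single precomposed code point U+00E9.
french_education_counts = {
    "Bac + 4": 2842,
    "Baccalaur\u00e9at": 2786,
    "Bac + 2": 1745,
    "Bac + 3": 1732,
    "Niveau bac": 895,
}

# Long names for the two-letter general-category codes that occur here.
CATEGORY_NAMES = {
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Zs": "Space Separator",
    "Sm": "Math Symbol",
    "Nd": "Decimal Number",
}

category_counts = Counter()
unique_characters = set()
for label, count in french_education_counts.items():
    for ch in label:
        # Each character contributes once per row that carries this label.
        category_counts[unicodedata.category(ch)] += count
        unique_characters.add(ch)

for code, count in category_counts.most_common():
    print(CATEGORY_NAMES[code], count)
```

The standard library exposes general categories but not scripts or blocks, so reproducing those two breakdowns would need another data source.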

Most occurring characters

Value    Count   Frequency (%)
a        19253   22.2%
(space)  13533   15.6%
c        12786   14.8%
B        9105    10.5%
+        6319    7.3%
u        3681    4.2%
4        2842    3.3%
l        2786    3.2%
r        2786    3.2%
é        2786    3.2%
t        2786    3.2%
2        1745    2.0%
3        1732    2.0%
N        895     1.0%
i        895     1.0%
v        895     1.0%
e        895     1.0%
b        895     1.0%

Most occurring categories

Value             Count   Frequency (%)
Lowercase Letter  50444   58.2%
Space Separator   13533   15.6%
Uppercase Letter  10000   11.5%
Math Symbol       6319    7.3%
Decimal Number    6319    7.3%

Most frequent Uppercase Letter characters

Value    Count   Frequency (%)
B        9105    91.0%
N        895     8.9%

Most frequent Lowercase Letter characters

Value    Count   Frequency (%)
a        19253   38.2%
c        12786   25.3%
u        3681    7.3%
l        2786    5.5%
r        2786    5.5%
é        2786    5.5%
t        2786    5.5%
i        895     1.8%
v        895     1.8%
e        895     1.8%
b        895     1.8%

Most frequent Space Separator characters

Value    Count   Frequency (%)
(space)  13533   100.0%

Most frequent Math Symbol characters

Value    Count   Frequency (%)
+        6319    100.0%

Most frequent Decimal Number characters

Value    Count   Frequency (%)
4        2842    45.0%
2        1745    27.6%
3        1732    27.4%

Most occurring scripts

Value    Count   Frequency (%)
Latin    60444   69.8%
Common   26171   30.2%

Most frequent Latin characters

Value    Count   Frequency (%)
a        19253   31.9%
c        12786   21.2%
B        9105    15.1%
u        3681    6.1%
l        2786    4.6%
r        2786    4.6%
é        2786    4.6%
t        2786    4.6%
N        895     1.5%
i        895     1.5%
v        895     1.5%
e        895     1.5%
b        895     1.5%

Most frequent Common characters

Value    Count   Frequency (%)
(space)  13533   51.7%
+        6319    24.1%
4        2842    10.9%
2        1745    6.7%
3        1732    6.6%

Most occurring blocks

Value    Count   Frequency (%)
ASCII    83829   96.8%
None     2786    3.2%

Most frequent ASCII characters

Value    Count   Frequency (%)
a        19253   23.0%
(space)  13533   16.1%
c        12786   15.3%
B        9105    10.9%
+        6319    7.5%
u        3681    4.4%
4        2842    3.4%
l        2786    3.3%
r        2786    3.3%
t        2786    3.3%
2        1745    2.1%
3        1732    2.1%
N        895     1.1%
i        895     1.1%
v        895     1.1%
e        895     1.1%
b        895     1.1%

Most frequent None characters

Value    Count   Frequency (%)
é        2786    100.0%

Distinct count: 5
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 78.2 KiB

Value           Count   Frequency (%)
Professional    2824    28.2%
Skilled Manual  2340    23.4%
Clerical        1729    17.3%
Management      1676    16.8%
Manual          1431    14.3%

Length

Max length: 14
Median length: 12
Mean length: 10.5826
Min length: 6

Overview of Unicode Properties

Unique unicode characters: 21
Unique unicode categories: 3
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

Value    Count   Frequency (%)
a        15447   14.6%
l        14733   13.9%
e        10245   9.7%
n        9947    9.4%
i        6893    6.5%
o        5648    5.3%
s        5648    5.3%
M        5447    5.1%
r        4553    4.3%
u        3771    3.6%
P        2824    2.7%
f        2824    2.7%
S        2340    2.2%
k        2340    2.2%
d        2340    2.2%
(space)  2340    2.2%
C        1729    1.6%
c        1729    1.6%
g        1676    1.6%
m        1676    1.6%
t        1676    1.6%

Most occurring categories

Value             Count   Frequency (%)
Lowercase Letter  91146   86.1%
Uppercase Letter  12340   11.7%
Space Separator   2340    2.2%

Most frequent Uppercase Letter characters

Value    Count   Frequency (%)
M        5447    44.1%
P        2824    22.9%
S        2340    19.0%
C        1729    14.0%

Most frequent Lowercase Letter characters

Value    Count   Frequency (%)
a        15447   16.9%
l        14733   16.2%
e        10245   11.2%
n        9947    10.9%
i        6893    7.6%
o        5648    6.2%
s        5648    6.2%
r        4553    5.0%
u        3771    4.1%
f        2824    3.1%
k        2340    2.6%
d        2340    2.6%
c        1729    1.9%
g        1676    1.8%
m        1676    1.8%
t        1676    1.8%

Most frequent Space Separator characters

Value    Count   Frequency (%)
(space)  2340    100.0%

Most occurring scripts

Value    Count    Frequency (%)
Latin    103486   97.8%
Common   2340     2.2%

Most frequent Latin characters

Value    Count   Frequency (%)
a        15447   14.9%
l        14733   14.2%
e        10245   9.9%
n        9947    9.6%
i        6893    6.7%
o        5648    5.5%
s        5648    5.5%
M        5447    5.3%
r        4553    4.4%
u        3771    3.6%
P        2824    2.7%
f        2824    2.7%
S        2340    2.3%
k        2340    2.3%
d        2340    2.3%
C        1729    1.7%
c        1729    1.7%
g        1676    1.6%
m        1676    1.6%
t        1676    1.6%

Most frequent Common characters

Value    Count   Frequency (%)
(space)  2340    100.0%

Most occurring blocks

Value    Count    Frequency (%)
ASCII    105826   100.0%

Most frequent ASCII characters

Value    Count   Frequency (%)
a        15447   14.6%
l        14733   13.9%
e        10245   9.7%
n        9947    9.4%
i        6893    6.5%
o        5648    5.3%
s        5648    5.3%
M        5447    5.1%
r        4553    4.3%
u        3771    3.6%
P        2824    2.7%
f        2824    2.7%
S        2340    2.2%
k        2340    2.2%
d        2340    2.2%
(space)  2340    2.2%
C        1729    1.6%
c        1729    1.6%
g        1676    1.6%
m        1676    1.6%
t        1676    1.6%

Distinct count: 5
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 78.2 KiB

Value                 Count   Frequency (%)
Profesional           2824    28.2%
Obrero especializado  2340    23.4%
Administrativo        1729    17.3%
Gestión               1676    16.8%
Obrero                1431    14.3%

Length

Max length: 20
Median length: 11
Mean length: 12.2388
Min length: 6

Overview of Unicode Properties

Unique unicode characters: 23
Unique unicode categories: 3
Unique unicode scripts: 2
Unique unicode blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

Value    Count   Frequency (%)
i        14367   11.7%
o        13488   11.0%
e        12951   10.6%
r        12095   9.9%
a        9233    7.5%
s        8569    7.0%
n        6229    5.1%
l        5164    4.2%
t        5134    4.2%
d        4069    3.3%
O        3771    3.1%
b        3771    3.1%
P        2824    2.3%
f        2824    2.3%
(space)  2340    1.9%
p        2340    1.9%
c        2340    1.9%
z        2340    1.9%
A        1729    1.4%
m        1729    1.4%
v        1729    1.4%
G        1676    1.4%
ó        1676    1.4%

Most occurring categories

Value             Count    Frequency (%)
Lowercase Letter  110048   89.9%
Uppercase Letter  10000    8.2%
Space Separator   2340     1.9%

Most frequent Uppercase Letter characters

Value    Count   Frequency (%)
O        3771    37.7%
P        2824    28.2%
A        1729    17.3%
G        1676    16.8%

Most frequent Lowercase Letter characters

Value    Count   Frequency (%)
i        14367   13.1%
o        13488   12.3%
e        12951   11.8%
r        12095   11.0%
a        9233    8.4%
s        8569    7.8%
n        6229    5.7%
l        5164    4.7%
t        5134    4.7%
d        4069    3.7%
b        3771    3.4%
f        2824    2.6%
p        2340    2.1%
c        2340    2.1%
z        2340    2.1%
m        1729    1.6%
v        1729    1.6%
ó        1676    1.5%

Most frequent Space Separator characters

Value    Count   Frequency (%)
(space)  2340    100.0%

Most occurring scripts

Value    Count    Frequency (%)
Latin    120048   98.1%
Common   2340     1.9%

Most frequent Latin characters

Value    Count   Frequency (%)
i        14367   12.0%
o        13488   11.2%
e        12951   10.8%
r        12095   10.1%
a        9233    7.7%
s        8569    7.1%
n        6229    5.2%
l        5164    4.3%
t        5134    4.3%
d        4069    3.4%
O        3771    3.1%
b        3771    3.1%
P        2824    2.4%
f        2824    2.4%
p        2340    1.9%
c        2340    1.9%
z        2340    1.9%
A        1729    1.4%
m        1729    1.4%
v        1729    1.4%
G        1676    1.4%
ó        1676    1.4%

Most frequent Common characters

Value    Count   Frequency (%)
(space)  2340    100.0%

Most occurring blocks

Value    Count    Frequency (%)
ASCII    120712   98.6%
None     1676     1.4%

Most frequent ASCII characters

Value    Count   Frequency (%)
i        14367   11.9%
o        13488   11.2%
e        12951   10.7%
r        12095   10.0%
a        9233    7.6%
s        8569    7.1%
n        6229    5.2%
l        5164    4.3%
t        5134    4.3%
d        4069    3.4%
O        3771    3.1%
b        3771    3.1%
P        2824    2.3%
f        2824    2.3%
(space)  2340    1.9%
p        2340    1.9%
c        2340    1.9%
z        2340    1.9%
A        1729    1.4%
m        1729    1.4%
v        1729    1.4%
G        1676    1.4%

Most frequent None characters

Value    Count   Frequency (%)
ó        1676    100.0%

Distinct count: 5
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 78.2 KiB

Value       Count   Frequency (%)
Cadre       2824    28.2%
Technicien  2340    23.4%
Employé     1729    17.3%
Direction   1676    16.8%
Ouvrier     1431    14.3%

Length

Max length: 10
Median length: 7
Mean length: 7.4724
Min length: 5

Overview of Unicode Properties

Unique unicode characters: 22
Unique unicode categories: 2
Unique unicode scripts: 1
Unique unicode blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

Value    Count   Frequency (%)
e        10611   14.2%
i        9463    12.7%
r        7362    9.9%
c        6356    8.5%
n        6356    8.5%
o        3405    4.6%
C        2824    3.8%
a        2824    3.8%
d        2824    3.8%
T        2340    3.1%
h        2340    3.1%
E        1729    2.3%
m        1729    2.3%
p        1729    2.3%
l        1729    2.3%
y        1729    2.3%
é        1729    2.3%
D        1676    2.2%
t        1676    2.2%
O        1431    1.9%
u        1431    1.9%
v        1431    1.9%

Most occurring categories

Value             Count   Frequency (%)
Lowercase Letter  64724   86.6%
Uppercase Letter  10000   13.4%

Most frequent Uppercase Letter characters

Value    Count   Frequency (%)
C        2824    28.2%
T        2340    23.4%
E        1729    17.3%
D        1676    16.8%
O        1431    14.3%

Most frequent Lowercase Letter characters

Value    Count   Frequency (%)
e        10611   16.4%
i        9463    14.6%
r        7362    11.4%
c        6356    9.8%
n        6356    9.8%
o        3405    5.3%
a        2824    4.4%
d        2824    4.4%
h        2340    3.6%
m        1729    2.7%
p        1729    2.7%
l        1729    2.7%
y        1729    2.7%
é        1729    2.7%
t        1676    2.6%
u        1431    2.2%
v        1431    2.2%

Most occurring scripts

Value    Count   Frequency (%)
Latin    74724   100.0%

Most frequent Latin characters

Value    Count   Frequency (%)
e        10611   14.2%
i        9463    12.7%
r        7362    9.9%
c        6356    8.5%
n        6356    8.5%
o        3405    4.6%
C        2824    3.8%
a        2824    3.8%
d        2824    3.8%
T        2340    3.1%
h        2340    3.1%
E        1729    2.3%
m        1729    2.3%
p        1729    2.3%
l        1729    2.3%
y        1729    2.3%
é        1729    2.3%
D        1676    2.2%
t        1676    2.2%
O        1431    1.9%
u        1431    1.9%
v        1431    1.9%

Most occurring blocks

Value    Count   Frequency (%)
ASCII    72995   97.7%
None     1729    2.3%

Most frequent ASCII characters

Value    Count   Frequency (%)
e        10611   14.5%
i        9463    13.0%
r        7362    10.1%
c        6356    8.7%
n        6356    8.7%
o        3405    4.7%
C        2824    3.9%
a        2824    3.9%
d        2824    3.9%
T        2340    3.2%
h        2340    3.2%
E        1729    2.4%
m        1729    2.4%
p        1729    2.4%
l        1729    2.4%
y        1729    2.4%
D        1676    2.3%
t        1676    2.3%
O        1431    2.0%
u        1431    2.0%
v        1431    2.0%

Most frequent None characters

Value    Count   Frequency (%)
é        1729    100.0%

Distinct count: 2
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 9.9 KiB

Value    Count   Frequency (%)
1        6820    68.2%
0        3180    31.8%

NUMBER_CARS_OWNED
Real number (ℝ≥0)

ZEROS

Distinct count: 5
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.4897
Minimum: 0
Maximum: 4
Zeros: 2354
Zeros (%): 23.5%
Memory size: 9.9 KiB