
Inequalities and Quantile Regression

by Arthur Charpentier · Feb. 18, 15 · Big Data Zone · Interview

In the course on inequality measures, we’ve seen how to compute various (standard) inequality indices, based on some sample of incomes (that can be binned, in various categories). On Thursday, we discussed the fact that incomes can be related to different variables (e.g. experience), and that comparing income inequalities between countries can be biased if they have very different age structures.

So we’ve seen quantile regressions. I can mention some old slides (used in a crash course at McGill three years ago), as well as a more technical discussion on ties and the non-uniqueness of the regression line.

To illustrate, consider the following dataset:

> salary <- read.table("http://data.princeton.edu/wws509/datasets/salary.dat",header=TRUE)
> plot(salary$yd,salary$sl)
> abline(lm(sl~yd,data=salary),col="blue")

We have here the standard regression line, obtained using ordinary least squares: it gives the expected income given the experience. But we can also use a quantile regression, which models the conditional quantile of level τ as a linear function of the covariates:

$$q_\tau(y \mid \boldsymbol{x}) = \boldsymbol{x}^{\mathsf{T}} \boldsymbol{\beta}$$

> library(quantreg)
> q10 <- rq(sl~yd,data=salary,tau=.1)
> q90 <- rq(sl~yd,data=salary,tau=.9)
> abline(q10,col="red")
> abline(q90,col="purple")
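To see what "quantile regression" optimises, it can help to look at the simplest case with no covariate. A standard fact (not specific to this post) is that a quantile of level τ minimises the sum of "check" (or pinball) losses; quantile regression applies the same loss to the residuals of a linear model. The sketch below uses base R only, on a synthetic sample; the names `check_loss`, `obj` and `fit` are hypothetical, chosen for illustration.

```r
# Check (pinball) loss: rho_tau(u) = u * (tau - 1{u < 0})
check_loss <- function(u, tau) u * (tau - (u < 0))

set.seed(1)
y   <- rexp(201)   # synthetic, skewed sample (illustration only, not the salary data)
tau <- 0.9

# Total check loss when the whole sample is summarised by a single constant q
obj <- function(q) sum(check_loss(y - q, tau))

# Numerical minimisation over q (the objective is convex and piecewise linear)
fit <- optimize(obj, interval = range(y))

# The minimiser should agree, up to the optimiser's tolerance,
# with the empirical 90% quantile (inverse of the empirical cdf)
fit$minimum
quantile(y, tau, type = 1)
```

Replacing the constant `q` by a linear predictor such as `b0 + b1 * yd` gives the objective that a quantile regression minimises for each value of `tau`.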

A classical tool to describe inequalities is the ratio of the 90% quantile over the 10% quantile (among so many others):

> ratio9010 = function(age){
+   predict(q90,newdata=data.frame(yd=age))/
+   predict(q10,newdata=data.frame(yd=age))
+ }

For instance, among people with 5 years of experience, there is an inequality index of

> ratio9010(5)
1.401749

While for people with 30 years of experience, it would be

> ratio9010(30)
1.9488

If we plot the evolution of this 90-10 ratio as a function of experience, we get the following increasing trend:

> a=0:30
> plot(a,Vectorize(ratio9010)(a),type="l",ylab="90-10 quantile ratio")
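The shape of this curve can be read off the two fitted lines. As a short derivation, write the two quantile regression lines as functions of experience x, with intercepts and slopes denoted α and β (notation introduced here for illustration; the fitted values themselves are not reported in the post):

```latex
R(x) = \frac{q_{0.9}(y \mid x)}{q_{0.1}(y \mid x)}
     = \frac{\alpha_{0.9} + \beta_{0.9}\, x}{\alpha_{0.1} + \beta_{0.1}\, x},
\qquad
R'(x) = \frac{\beta_{0.9}\,\alpha_{0.1} - \beta_{0.1}\,\alpha_{0.9}}
             {\left(\alpha_{0.1} + \beta_{0.1}\, x\right)^{2}} .
```

The sign of R'(x) does not depend on x, so with two straight quantile lines (and a positive 10% quantile) the 90-10 ratio is monotone in experience. It is increasing exactly when β_{0.9} α_{0.1} > β_{0.1} α_{0.9}, which is consistent with the ratio being larger at 30 years than at 5 years.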

So clearly, comparing inequalities ceteris paribus between two groups should be performed carefully, probably by including some covariates.
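A small simulation can make the composition effect concrete. In the sketch below, two hypothetical populations share exactly the same income model given experience, and differ only in their experience structure; all names and parameter values (`simulate_income`, `ratio9010_sample`, the coefficients) are assumptions made for illustration, not estimates from the salary data.

```r
set.seed(123)

# Hypothetical income model, identical for both populations:
# the level and the spread of log-income both grow with experience
simulate_income <- function(experience) {
  exp(rnorm(length(experience),
            mean = 10 + 0.02 * experience,
            sd   = 0.2 + 0.01 * experience))
}

# Unconditional 90-10 ratio computed from a raw sample
ratio9010_sample <- function(x) {
  q <- quantile(x, probs = c(0.1, 0.9))
  unname(q[2] / q[1])
}

n <- 5000
exp_young <- runif(n, 0, 15)    # population with a junior workforce
exp_old   <- runif(n, 15, 30)   # population with a senior workforce

r_young <- ratio9010_sample(simulate_income(exp_young))
r_old   <- ratio9010_sample(simulate_income(exp_old))

# Same conditional model, yet different unconditional inequality indices
c(young = r_young, old = r_old)
```

The two ratios differ although the conditional distribution of income given experience is the same in both populations, which is the kind of bias a comparison without covariates can pick up.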



Published at DZone with permission of Arthur Charpentier, DZone MVB. See the original article here.
