San José State University
Department of Economics

applet-magic.com
Thayer Watkins
Silicon Valley
& Tornado Alley
USA

Estimating the Parameters of a
Bent Line in Regression Analysis

The Unconstrained Case

Suppose y is a function of x, but the slope of the relationship changes at x=k1. A regression line for such a function is obtained by defining a new variable x1 such that

x1 = 0 if x<k1
and
x1 = x − k1 if x≥k1

The variable x1 can be defined more succinctly as x1=u(x-k1), where u(z) is the function such that u(z)=0 if z<0 and u(z)=z for z≥0.
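
The definition of u(z) translates directly into code. The following Python sketch is illustrative only: the function name hinge, the bend point k1 = 5 and the sample x values are assumptions chosen for the example, not part of the original text.

```python
def hinge(z):
    """u(z) from the text: 0 if z < 0, and z if z >= 0."""
    return z if z >= 0 else 0

k1 = 5                      # hypothetical bend point
x = [2, 4, 5, 7, 10]        # hypothetical observations of x

# x1 = u(x - k1): zero up to the bend point, distance past it afterwards
x1 = [hinge(xi - k1) for xi in x]
print(x1)                   # [0, 0, 0, 2, 5]
```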

The regression of y on x and x1 gives an equation such as

y = d0 + c0x + c1x1

The coefficient c1 gives the change in the slope of the relationship at k1.

Thus the slopes of the relationship are:

dy/dx = c0 for x<k1
and
dy/dx = c0+c1 for x≥k1
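
As a check on this interpretation, the sketch below generates noise-free data from a known bent line and recovers the coefficients by ordinary least squares. Everything named here is an assumption made for illustration: the helper ols (a small normal-equations solver written for this example, not a library routine), the bend point k1 = 5, and the true values d0 = 2, c0 = 0.5, c1 = 1.5. In practice a statistics package would do the fitting.

```python
def hinge(z):
    """u(z): 0 if z < 0, z otherwise."""
    return z if z >= 0 else 0.0

def ols(columns, y):
    """Least-squares coefficients [intercept, b1, b2, ...] via the normal equations."""
    rows = [[1.0, *vals] for vals in zip(*columns)]      # design matrix with intercept
    p = len(rows[0])
    # Augmented normal equations [X'X | X'y]
    a = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(p):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [v - f * w for v, w in zip(a[r], a[col])]
    return [a[i][p] / a[i][i] for i in range(p)]

k1 = 5.0                                        # hypothetical bend point
x = [float(i) for i in range(11)]               # x = 0, 1, ..., 10
x1 = [hinge(xi - k1) for xi in x]
y = [2.0 + 0.5 * xi + 1.5 * x1i for xi, x1i in zip(x, x1)]   # noise-free bent line

d0, c0, c1 = ols([x, x1], y)                    # regress y on x and x1
slope_below = c0                                # dy/dx for x < k1
slope_above = c0 + c1                           # dy/dx for x >= k1
print(slope_below, slope_above)
```

With real, noisy data the recovered coefficients would of course only approximate the underlying values.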

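For more than one bend point the procedure is analogous: a variable xi=u(x−ki) is defined for each bend point ki, and the slope of the relationship at any level of x is c0 plus the sum of the coefficients ci for the bend points at or below that level of x.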
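
The running-sum rule for slopes can be sketched in a few lines of Python. The bend points and coefficient values below are hypothetical numbers chosen for illustration, as is the helper name slope_at.

```python
from itertools import accumulate

knots = [3, 7]               # hypothetical bend points k1, k2
coeffs = [0.5, 2.0, -1.0]    # hypothetical fitted c0, c1, c2

# Slope in each successive interval: c0, c0+c1, c0+c1+c2
interval_slopes = list(accumulate(coeffs))

def slope_at(x):
    """Slope at x: c0 plus every ci whose bend point ki is at or below x."""
    return coeffs[0] + sum(c for k, c in zip(knots, coeffs[1:]) if k <= x)

print(interval_slopes)       # [0.5, 2.5, 1.5]
```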

The Constrained Case

This covers the case in which the slopes of the relationship over two or more different intervals are required to be the same. For example, suppose y is a function of time, x, with bend points at k1 and k2. Furthermore, suppose the slope of the relationship in the third interval (k2<x) has to be the same as the slope in the first interval (0<x<k1).

Now consider the case in which a regression of y on the variables x, z and w has to be of the form

y = a + bx + bz + cw

This form is the same as

y = a + b(x+z) + cw

That is to say, y must be regressed on (x+z) and w. Adding two variables together forces the regression coefficients to be the same.

Likewise, if the regression has to be of the form

y = a + bx − bz + cw
then
y = a + b(x−z) + cw

and y must be regressed on (x−z) and w.

Now consider again the previous example, in which the slope in the third interval of a trend line is required to be the same as the slope in the first interval. First, two additional variables x1 and x2 need to be defined as

x1 = u(x-k1)
and
x2 = u(x-k2)

An unconstrained regression would yield an equation of the form

y = d0 + c0x + c1x1 + c2x2

The slope of the relationship in the first interval is c0, in the second interval it is c0+c1, and in the third interval it is c0+c1+c2. For the slopes in the first and third intervals to be equal requires that

c0 = c0+c1+c2
which reduces to
c2 = −c1

Therefore the regression equation is of the form

y = d0 + c0x + c1x1 − c1x2
which reduces to
y = d0 + c0x + c1(x1−x2)

Thus y must be regressed on x and (x1−x2).
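
A minimal sketch of the constrained fit follows, assuming noise-free data so that the recovered values can be compared with known ones. The helper ols (a small normal-equations solver written for this example, not a library routine), the bend points k1 = 3 and k2 = 7, and the true values d0 = 1, c0 = 0.5, c1 = 2 are all hypothetical choices made for illustration.

```python
def hinge(z):
    """u(z): 0 if z < 0, z otherwise."""
    return z if z >= 0 else 0.0

def ols(columns, y):
    """Least-squares coefficients [intercept, b1, b2, ...] via the normal equations."""
    rows = [[1.0, *vals] for vals in zip(*columns)]      # design matrix with intercept
    p = len(rows[0])
    # Augmented normal equations [X'X | X'y]
    a = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(p):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [v - f * w for v, w in zip(a[r], a[col])]
    return [a[i][p] / a[i][i] for i in range(p)]

k1, k2 = 3.0, 7.0                               # hypothetical bend points
x = [float(i) for i in range(11)]               # x = 0, 1, ..., 10
x1 = [hinge(xi - k1) for xi in x]
x2 = [hinge(xi - k2) for xi in x]
diff = [u1 - u2 for u1, u2 in zip(x1, x2)]      # the combined regressor (x1 - x2)
y = [1.0 + 0.5 * xi + 2.0 * di for xi, di in zip(x, diff)]

d0, c0, c1 = ols([x, diff], y)                  # regress y on x and (x1 - x2)
c2 = -c1                                        # implied by the constraint
slopes = [c0, c0 + c1, c0 + c1 + c2]            # first, second, third interval
print(slopes)
```

Because x2 enters only through (x1−x2), the first and third slopes agree by construction and need no separate check after fitting.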

This method can be generalized.

Suppose the cyclic relationship is of the form

Then the regression would use a generated variable of the form

The other variable in the regression would just be the trend variable.

