## Pages

### Fitting Polynomial Regression Data in R

Polynomial regression is a nonlinear relationship between independent x and dependent y variables. Fitting such type of regression is essential when we analyze fluctuated data with some bends.

In this post, we'll learn how to fit and plot polynomial regression data in R. We use an lm() function in this regression model. Although it is a linear regression model function, lm() works well for polynomial models by changing the target formula type. The tutorial covers:

1. Preparing the data
2. Fitting the model
3. Finding the best fit
4. Source code listing

Preparing the data

We'll start by preparing test data for this tutorial as below.

peq = function(x) x^3+2*x^2+5

x = seq(-0.99, 1, by = .01)
y = peq(x) + runif(200)

df = data.frame(x = x, y = y)

x        y
1 -0.99 6.635701
2 -0.98 6.290250
3 -0.97 6.063431
4 -0.96 6.632796
5 -0.95 6.634153
6 -0.94 6.896084

We can visualize the the 'df' data to check visually in a plot. Our task is to fit this data with the best curve.

plot(df\$x, df\$y, pch=20, col="gray")

Fitting the model

We build a model with lm() function with a formula. I(x^2) represents x2 in a formula. We can also use poly(x,2) function and it is the same with the expression of I(x^2).

model = lm(y~x+I(x^3)+I(x^2), data = df)
summary(model)

Call:
lm(formula = y ~ x + I(x^3) + I(x^2), data = df)

Residuals:
Min          1Q      Median          3Q         Max
-0.49598082 -0.21488892 -0.01301059  0.18515573  0.58048188

Coefficients:
Estimate Std. Error  t value
(Intercept)  4.3634157  0.1091087 39.99144
x           -0.1078152  0.9309088 -0.11582
I(x^3)      -0.5925309  1.3905638 -0.42611
I(x^2)       3.6462591  2.1359770  1.70707
Pr(>|t|)
(Intercept) < 0.0000000000000002 ***
x                       0.908039
I(x^3)                  0.670983
I(x^2)                  0.091042 .
---
Signif. codes:
0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.2626079 on 96 degrees of freedom
Multiple R-squared:  0.9243076, Adjusted R-squared:  0.9219422
F-statistic: 390.7635 on 3 and 96 DF,  p-value: < 0.00000000000000022204

Next, we'll predict data with a trained model.

pred = predict(model,data=df)

Finding the best fit

Finding the best-fitted curve is important. We check the model with various possible functions. Here, we apply four types of function to fit and check their performance.

The orange line (linear regression) and yellow curve are the wrong choices for this data. The pink curve is close, but the blue curve is the best match for our data trend. Thus, I use the y~x3+x2 formula to build our polynomial regression model.
You may find the best-fit formula for your data by visualizing them in a plot.

The source code of a plot is listed below.

windows(width=8, height=6)
plot(x=df\$x, y=df\$y, pch=20, col="grey")

lines(df\$x, predict(lm(y~x, data=df)), type="l", col="orange1", lwd=2)
lines(df\$x, predict(lm(y~I(x^2), data=df)), type="l", col="pink1", lwd=2)
lines(df\$x, predict(lm(y~I(x^3), data=df)), type="l", col="yellow2", lwd=2)
lines(df\$x, predict(lm(y~poly(x,3)+poly(x,2), data=df)), type="l", col="blue", lwd=2)

legend("topleft",
legend = c("y~x,  - linear","y~x^2", "y~x^3", "y~x^3+x^2"),
col = c("orange","pink","yellow","blue"),
lty = 1, lwd=3
)

Plotting the result

1. Plotting with a plot() function.

pred = predict(model,data = df)
lines(df\$x, pred, lwd = 3, col = "blue")

2. Plotting with a ggplot().

Polynomial regression data can be easily fitted and plotted with ggplot().

library(ggplot2)
ggplot(data=df, aes(x,y)) +
geom_point() +
geom_smooth(method="lm", formula=y~I(x^3)+I(x^2))

In this tutorial, we have briefly learned how to fit polynomial regression data and plot the results with a plot() and ggplot() functions in R. The full source code is listed below.

Source code listing

peq = function(x) x^3+2*x^2+5

x = seq(-0.99, 1, by = .01)
y = peq(x) + runif(200)

df = data.frame(x = x, y = y)

plot(df\$x, df\$y, pch=20, col="gray")

model = lm(y~x+I(x^3)+I(x^2), data = df)
summary(model)

pred = predict(model,data=df)

windows(width=8, height=6)
plot(x=df\$x, y=df\$y, pch=20, col="grey")

lines(df\$x, predict(lm(y~x, data=df)), type="l", col="orange1", lwd=2)
lines(df\$x, predict(lm(y~I(x^2), data=df)), type="l", col="pink1", lwd=2)
lines(df\$x, predict(lm(y~I(x^3), data=df)), type="l", col="yellow2", lwd=2)
lines(df\$x, predict(lm(y~poly(x,3)+poly(x,2), data=df)), type="l", col="blue", lwd=2)

legend("topleft",
legend = c("y~x,  - linear","y~x^2", "y~x^3", "y~x^3+x^2"),
col = c("orange","pink","yellow","blue"),
lty = 1, lwd=3

pred = predict(model,data = df)
lines(df\$x, pred, lwd = 3, col = "blue")

library(ggplot2)
ggplot(data=df, aes(x,y)) + geom_point()
+ geom_smooth(method="lm", formula=y~I(x^3)+I(x^2))

#### 1 comment:

1. Drawing trend lines is one of the few easy techniques that really WORK. Prices respect a trend line, or break through it resulting in a massive move. Drawing good trend lines is the MOST REWARDING skill.

The problem is, as you may have already experienced, too many false breakouts. You see trend lines everywhere, however not all trend lines should be considered. You have to distinguish between STRONG and WEAK trend lines.

One good guideline is that a strong trend line should have AT LEAST THREE touching points. Trend lines with more than four touching points are MONSTER trend lines and you should be always prepared for the massive breakout!

This sophisticated software automatically draws only the strongest trend lines and recognizes the most reliable chart patterns formed by trend lines...

http://www.forextrendy.com?kdhfhs93874

Chart patterns such as "Triangles, Flags and Wedges" are price formations that will provide you with consistent profits.

Before the age of computing power, the professionals used to analyze every single chart to search for chart patterns. This kind of analysis was very time consuming, but it was worth it. Now it's time to use powerful dedicated computers that will do the job for you:

http://www.forextrendy.com?kdhfhs93874