Polynomial regression models a nonlinear relationship between an independent variable x and a dependent variable y. This type of regression is useful when the data fluctuates and its trend has one or more bends.

In this post, we'll learn how to fit and plot polynomial regression data in R. We use the lm() function to build the model. Although lm() is a linear regression function, it works well for polynomial models because the model stays linear in its coefficients; only the formula changes. The tutorial covers:

- Preparing the data
- Fitting the model
- Finding the best fit
- Plotting the result
- Source code listing

**Preparing the data**

We'll start by preparing test data for this tutorial as below.

```
# Cubic function that generates the underlying trend
peq = function(x) x^3+2*x^2+5

x = seq(-0.99, 1, by = .01)   # 200 evenly spaced points
y = peq(x) + runif(200)       # add uniform noise in [0, 1]

df = data.frame(x = x, y = y)
head(df)
```

```
      x        y
1 -0.99 6.635701
2 -0.98 6.290250
3 -0.97 6.063431
4 -0.96 6.632796
5 -0.95 6.634153
6 -0.94 6.896084
```

We can plot the 'df' data to inspect it visually. Our task is to fit this data with the best curve.

`plot(df$x, df$y, pch=20, col="gray")`


**Fitting the model**

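We build the model with the lm() function and a formula. In a formula, I(x^2) represents x^2. We can also use the poly(x,2) function, which adds both the linear and the quadratic term. By default poly() uses orthogonal polynomials, so the fitted curve is the same as with x+I(x^2) but the coefficients differ unless raw=TRUE is set.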
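To make the relationship between I() and poly() concrete, here is a small sketch on made-up data (the x and y vectors and the seed are illustrative, not the tutorial's data). It fits the same quadratic three ways and checks that the fitted values agree, while only the raw = TRUE variant reproduces the I() coefficients.

```r
set.seed(1)                                  # arbitrary seed for the toy data
x = seq(-1, 1, length.out = 50)
y = 1 + 2 * x + 3 * x^2 + rnorm(50, sd = 0.1)

m_I    = lm(y ~ x + I(x^2))                  # explicit raw terms
m_raw  = lm(y ~ poly(x, 2, raw = TRUE))      # raw polynomial basis: x, x^2
m_orth = lm(y ~ poly(x, 2))                  # orthogonal basis (the default)

# All three describe the same curve, so the fitted values match
all.equal(unname(fitted(m_I)), unname(fitted(m_raw)))
all.equal(unname(fitted(m_I)), unname(fitted(m_orth)))

# Coefficients agree only for the raw basis
all.equal(unname(coef(m_I)), unname(coef(m_raw)))
coef(m_orth)
```

The orthogonal basis is usually better conditioned numerically, while the raw basis gives coefficients that can be read directly as the terms of the polynomial.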

```
model = lm(y~x+I(x^3)+I(x^2), data = df)
summary(model)
```

```
Call:
lm(formula = y ~ x + I(x^3) + I(x^2), data = df)

Residuals:
        Min          1Q      Median          3Q         Max
-0.49598082 -0.21488892 -0.01301059  0.18515573  0.58048188

Coefficients:
              Estimate Std. Error  t value             Pr(>|t|)
(Intercept)  4.3634157  0.1091087 39.99144 < 0.0000000000000002 ***
x           -0.1078152  0.9309088 -0.11582             0.908039
I(x^3)      -0.5925309  1.3905638 -0.42611             0.670983
I(x^2)       3.6462591  2.1359770  1.70707             0.091042 .
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.2626079 on 96 degrees of freedom
Multiple R-squared:  0.9243076, Adjusted R-squared:  0.9219422
F-statistic: 390.7635 on 3 and 96 DF,  p-value: < 0.00000000000000022204
```
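The printed summary is convenient to read, but individual values can also be pulled out programmatically. The following sketch regenerates data with the tutorial's peq() function and an arbitrary seed (an assumption, so its numbers will not match the output shown) and extracts the coefficients, adjusted R-squared, and p-values.

```r
peq = function(x) x^3 + 2 * x^2 + 5
set.seed(42)                          # arbitrary seed (assumption)
x = seq(-0.99, 1, by = 0.01)
df = data.frame(x = x, y = peq(x) + runif(length(x)))

model = lm(y ~ x + I(x^2) + I(x^3), data = df)

est = coef(model)                     # named vector of estimates
print(round(est, 3))

# runif() noise has mean 0.5, so the intercept should land near 5 + 0.5;
# the x^2 and x^3 estimates should land near 2 and 1
s = summary(model)
s$adj.r.squared                       # adjusted R-squared
s$coefficients[, "Pr(>|t|)"]          # p-value for each term
df.residual(model)                    # 200 observations - 4 coefficients = 196
```

Comparing the estimates with the known generating function is a quick sanity check that is only possible with simulated data like this.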

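Next, we'll predict data with the trained model. Note that predict() takes the input data through its newdata argument; an argument named data is silently ignored.

`pred = predict(model, newdata=df)`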
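When newdata is omitted, predict() returns the fitted values for the training data. To score unseen x values, pass a data frame whose column name matches the predictor in the formula. A minimal sketch, with the seed and the new_x values chosen purely for illustration:

```r
peq = function(x) x^3 + 2 * x^2 + 5
set.seed(7)                           # arbitrary seed (assumption)
x = seq(-0.99, 1, by = 0.01)
df = data.frame(x = x, y = peq(x) + runif(length(x)))
model = lm(y ~ x + I(x^2) + I(x^3), data = df)

# Without newdata, predict() is the same as fitted()
all.equal(predict(model), fitted(model))

# New observations must be supplied as a data frame with a column named x
new_x = data.frame(x = c(-0.5, 0, 0.5))
new_pred = predict(model, newdata = new_x)
new_pred

# An argument called data is not recognized by predict.lm and is ignored,
# so this still returns one value per training row, not per row of new_x
length(predict(model, data = new_x))
```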

**Finding the best fit**

Finding the best-fitting curve is important, so we try several candidate formulas. Here, we fit four formulas and compare how well each one follows the data.

The orange line (linear regression) and the yellow curve are the wrong choices for this data. The pink curve is close, but the blue curve is the best match for our data trend. Thus, we use the cubic formula y~x+I(x^3)+I(x^2) to build our polynomial regression model.

You may find the best-fit formula for your own data by plotting the candidate curves over it.

The source code for the plot is listed below.

```
dev.new(width=8, height=6)   # portable alternative to windows()
plot(x=df$x, y=df$y, pch=20, col="grey")
lines(df$x, predict(lm(y~x, data=df)), col="orange1", lwd=2)
lines(df$x, predict(lm(y~I(x^2), data=df)), col="pink1", lwd=2)
lines(df$x, predict(lm(y~I(x^3), data=df)), col="yellow2", lwd=2)
# poly(x,3) already contains the linear, quadratic, and cubic terms
lines(df$x, predict(lm(y~poly(x,3), data=df)), col="blue", lwd=2)
legend("topleft",
       legend = c("y~x - linear", "y~x^2", "y~x^3", "y~x^3+x^2"),
       col = c("orange1", "pink1", "yellow2", "blue"),   # match the line colors
       lty = 1, lwd=3
)
```
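Visual inspection can be complemented with a numeric comparison. The sketch below is not part of the original workflow: it fits full polynomials of degree 1 to 4 to data generated with the tutorial's peq() function (arbitrary seed, an assumption) and compares them by adjusted R-squared and AIC, then uses anova() to test whether each added degree improves the fit.

```r
peq = function(x) x^3 + 2 * x^2 + 5
set.seed(2024)                        # arbitrary seed (assumption)
x = seq(-0.99, 1, by = 0.01)
df = data.frame(x = x, y = peq(x) + runif(length(x)))

# Fit full polynomials of increasing degree
degrees = 1:4
fits = lapply(degrees, function(d) lm(y ~ poly(x, d), data = df))

comparison = data.frame(
  degree = degrees,
  adj_r2 = sapply(fits, function(m) summary(m)$adj.r.squared),
  aic    = sapply(fits, AIC)
)
print(comparison)

# Sequential F-tests: does each extra degree reduce the residual error?
print(anova(fits[[1]], fits[[2]], fits[[3]], fits[[4]]))
```

A higher adjusted R-squared and a lower AIC both point to a better trade-off between fit and complexity, so a degree that stops improving either measure is a reasonable place to stop.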

**Plotting the result**

1. Plotting with the plot() function.

```
plot(df$x, df$y, pch = 20, col = "gray")   # draw the data first
pred = predict(model, newdata = df)
lines(df$x, pred, lwd = 3, col = "blue")   # overlay the fitted curve
```
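When the x values are unsorted or sparse, drawing the fitted values directly can produce a jagged line. One common approach, sketched here with an illustrative grid size and seed (both assumptions), is to predict on an evenly spaced grid and add a 95% confidence band using the interval argument of predict().

```r
peq = function(x) x^3 + 2 * x^2 + 5
set.seed(99)                          # arbitrary seed (assumption)
x = seq(-0.99, 1, by = 0.01)
df = data.frame(x = x, y = peq(x) + runif(length(x)))
model = lm(y ~ x + I(x^2) + I(x^3), data = df)

# Evenly spaced grid over the observed range of x
grid = data.frame(x = seq(min(df$x), max(df$x), length.out = 100))

# Matrix with columns fit, lwr, upr (95% confidence interval for the mean)
band = predict(model, newdata = grid, interval = "confidence", level = 0.95)

plot(df$x, df$y, pch = 20, col = "gray")
lines(grid$x, band[, "fit"], lwd = 3, col = "blue")
lines(grid$x, band[, "lwr"], lty = 2, col = "blue")
lines(grid$x, band[, "upr"], lty = 2, col = "blue")
```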

2. Plotting with ggplot().

Polynomial regression data can be easily fitted and plotted with ggplot().

`library(ggplot2)`

`ggplot(data=df, aes(x,y)) +`

` geom_point() + `

` geom_smooth(method="lm", formula=y~I(x^3)+I(x^2))`

In this tutorial, we have briefly learned how to fit polynomial regression data and plot the results with the plot() and ggplot() functions in R. The full source code is listed below.

**Source code listing**

```
# Generate test data from a cubic function plus uniform noise
peq = function(x) x^3+2*x^2+5
x = seq(-0.99, 1, by = .01)
y = peq(x) + runif(200)
df = data.frame(x = x, y = y)
head(df)

plot(df$x, df$y, pch=20, col="gray")

# Fit the cubic polynomial model
model = lm(y~x+I(x^3)+I(x^2), data = df)
summary(model)

pred = predict(model, newdata=df)

# Compare candidate formulas
dev.new(width=8, height=6)   # portable alternative to windows()
plot(x=df$x, y=df$y, pch=20, col="grey")
lines(df$x, predict(lm(y~x, data=df)), col="orange1", lwd=2)
lines(df$x, predict(lm(y~I(x^2), data=df)), col="pink1", lwd=2)
lines(df$x, predict(lm(y~I(x^3), data=df)), col="yellow2", lwd=2)
lines(df$x, predict(lm(y~poly(x,3), data=df)), col="blue", lwd=2)
legend("topleft",
       legend = c("y~x - linear", "y~x^2", "y~x^3", "y~x^3+x^2"),
       col = c("orange1", "pink1", "yellow2", "blue"),
       lty = 1, lwd=3
)

# Plot the final model with base graphics
lines(df$x, pred, lwd = 3, col = "blue")

# Plot with ggplot2; each + must end a line, not start one
library(ggplot2)
ggplot(data=df, aes(x,y)) +
  geom_point() +
  geom_smooth(method="lm", formula=y~I(x^3)+I(x^2))
```
