This is my first Machine Learning project and it aims to find a relationship between the financials of a company, as disclosed in its annual financial statements, and how its stock performs in the following year. This project is set up as a supervised regression task.

As an Economics, Finance, and Computer Science student, I have often looked for ways to combine all three of these fields in innovative and practical ways. That was when I got the idea for this project. It is common knowledge that investing in the stock market is not an easy task due to its efficiency and competitiveness. However, I aimed to create a tool that would utilize the power of Machine Learning to potentially make the investing process somewhat more predictable.

In making this project, I faced many difficulties. Most notably, my initial project idea was to use NLP to find a relationship between the top daily news headlines and how the stock market as a whole would perform that same day. However, after countless attempts and using many different models, it became apparent that the algorithm is unable to detect any significant relationship and would thus output a constant value regardless of the news headline input. Then I had to change my project entirely.

In the process of putting together the website where I am hosting this project and setting up the algorithm in itself, I learned a lot, mostly about the importance of formulating an organized work flow for Machine Learning projects. Furthermore, it became clear the importance of data preprocessing and how doing so carefully would be the difference between a good or bad model. After preprocessing my data, I then split it up into train and test sets and began to test different models to see how well they would perform. I tested the following model: Linear Regression, Random Forest Regressor, SVM, Decision Tree Regressor, MLP Regressor, and a CNN using Sequential. Finally, I found that the Random Forest Regressor model performed best and was not overfitting the data, so I chose to use it over the other models.

Finally, to deploy the project, I created a webpage on my personal website with a form built using HTML. The form asks for input from the user, and then using Flask, it passes this on to a Python script that can validate the data and vectorize it, then feed it to the pre-trained model. The model is then able to output a percentage change that is a prediction of how that stock should behave the following year.

Built With

Share this project:
×

Updates