You can get 10 to 20 points
📒 Every project must include a brief description of the dataset, so that you get to know the data you are working with.
🔎 What model performance metrics have you decided to use?
🎯 Try at least 2 different models.
✅ A mandatory part of every project is a summary at the end in which you present the most interesting insights obtained.
Upload a 📝 Jupyter Notebook with descriptions included, or a PDF report plus the source code.
💡 Estimated time for the project is 5-10 h. This value depends heavily on your skill, but you can use it as guidance for the expected project size.
🎯 Deadline is 26. 5. 2024 🍀
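💡 As an illustration of the "at least 2 models" and "performance metrics" requirements above, here is a minimal sketch that compares a majority-class baseline with a small Naive Bayes classifier using accuracy and macro-averaged F1. Everything in it is an assumption made for illustration: the class names `MajorityBaseline` and `NaiveBayes`, the helper functions, and the tiny made-up review texts are hypothetical and do not come from any of the datasets. It uses only the Python standard library so that it runs anywhere; in a real project you would most likely use a library such as scikit-learn or a neural network framework instead.

```python
import math
from collections import Counter, defaultdict


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores (classes taken from y_true)."""
    scores = []
    for label in sorted(set(y_true)):
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


class MajorityBaseline:
    """Hypothetical model 1: always predicts the most frequent training label."""

    def fit(self, texts, labels):
        self.label_ = Counter(labels).most_common(1)[0][0]
        return self

    def predict(self, texts):
        return [self.label_ for _ in texts]


class NaiveBayes:
    """Hypothetical model 2: multinomial Naive Bayes over whitespace tokens
    with add-one smoothing. Tokens unseen in training are ignored."""

    def fit(self, texts, labels):
        self.class_counts_ = Counter(labels)
        self.word_counts_ = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts_[label].update(text.lower().split())
        self.vocab_ = {w for c in self.word_counts_.values() for w in c}
        return self

    def predict(self, texts):
        return [self._predict_one(t) for t in texts]

    def _predict_one(self, text):
        total = sum(self.class_counts_.values())
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts_.items():
            words = self.word_counts_[label]
            denom = sum(words.values()) + len(self.vocab_)
            score = math.log(count / total)  # log prior
            for token in text.lower().split():
                if token in self.vocab_:
                    score += math.log((words[token] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up toy data, only to make the sketch runnable.
train_texts = [
    "great hotel lovely staff", "terrible room dirty bathroom",
    "lovely view great breakfast", "dirty sheets terrible service",
    "great location friendly staff", "terrible noise rude staff",
    "dirty carpet rude reception",
]
train_labels = ["pos", "neg", "pos", "neg", "pos", "neg", "neg"]
test_texts = ["lovely staff great room", "dirty room terrible breakfast", "great breakfast"]
test_labels = ["pos", "neg", "pos"]

results = {}
for name, model in [("majority baseline", MajorityBaseline()), ("naive Bayes", NaiveBayes())]:
    pred = model.fit(train_texts, train_labels).predict(test_texts)
    results[name] = (accuracy(test_labels, pred), macro_f1(test_labels, pred))
    print(f"{name}: accuracy={results[name][0]:.2f}, macro-F1={results[name][1]:.2f}")
```

Reporting macro-F1 next to accuracy is one reasonable choice when classes are imbalanced, because a model that always predicts the majority class can still reach a deceptively high accuracy. Whichever metrics you pick, state in the notebook why you chose them.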
Upload the project to:
Dropbox:
💡 Dropbox will ask you for your name - use your VSB login please 👍
https://www.kaggle.com/datasets/kazanova/sentiment140
You can select a smaller subset of the dataset, as it contains a very large number of tweets.
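💡 One way to select such a subset is to draw the same number of random tweets from each class, so the smaller sample stays balanced. The sketch below is only an illustration under stated assumptions: `stratified_subset` is a hypothetical helper, the rows are made-up `(label, text)` pairs rather than real tweets, and the labels 0 (negative) and 4 (positive) are assumed to match how Sentiment140 encodes sentiment, so check this against the file you download.

```python
import random
from collections import defaultdict


def stratified_subset(rows, label_of, n_per_class, seed=42):
    """Return up to n_per_class randomly chosen rows for each label.

    rows     -- list of records (any type)
    label_of -- function that extracts the class label from a record
    seed     -- fixed seed so the same subset is drawn on every run
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for row in rows:
        by_label[label_of(row)].append(row)

    subset = []
    for label in sorted(by_label):
        group = by_label[label]
        subset.extend(rng.sample(group, min(n_per_class, len(group))))
    rng.shuffle(subset)  # avoid having all rows of one class first
    return subset


# Made-up stand-in for the real data: (label, text) pairs.
# Assumption: 0 = negative, 4 = positive.
toy_rows = [(0, f"negative tweet {i}") for i in range(50)]
toy_rows += [(4, f"positive tweet {i}") for i in range(30)]

subset = stratified_subset(toy_rows, label_of=lambda r: r[0], n_per_class=10)
print(len(subset), "rows selected")
```

Fixing the seed keeps the subset reproducible, which makes the results in your notebook easier to compare between runs. If you load the data with pandas instead, grouping by the label column and sampling within each group achieves the same effect.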
https://www.kaggle.com/datasets/rmisra/news-category-dataset
https://www.kaggle.com/datasets/datatattle/covid-19-nlp-text-classification
https://www.kaggle.com/datasets/andrewmvd/trip-advisor-hotel-reviews
You are very welcome to use a dataset of your own choice. In that case, just drop me an e-mail so I can approve your choice or give some recommendations.
You can find plenty of datasets on Kaggle.
If you work with an RNN in your semester project, you may submit that work instead of this project, even if it is not a classification task. Just drop me an e-mail and we will discuss it individually.