Solid primer on model selection. The library/messy pile analogy for structured vs unstructured data is spot-on. One thing that really resonates is the point about not defaulting to LLMs for everything. I've seen teams waste weeks trying to force GPT-4 onto tabular forecasting problems when a simple XGBoost model would've shipped faster and performed better. The decision tree at the end is practical for those who don't know where to start.
Thank you for reading this post and sharing your feedback :)