From Java Developer to AI Architect at IBM
After 11 years of Java, I switched to AI. I did not have to start over as a junior.
In 2014, I was at IBM Littleton Lab in Massachusetts for Watson training. I was learning how to build a Q&A system for IBM customer support. I had spent nine years building Java applications. I knew nothing about AI beyond a graduate course on data mining and machine learning. Watching Watson generate answers, I was amazed. The technology was just starting out, but I could see where things were heading.
That week was my first exposure to AI. Two years later, I made the full transition.
My fear was stepping into territory with no clear path to production. But I had always tried new technology early, even in my Java career. I knew that if I stayed comfortable, I would be fine for a few years. Eventually, my growth would stop. I was not willing to let that happen.
The Java Years
For 11 years, my world was enterprise Java. I built web applications and web services, deployed them to production, and troubleshot server failures. I developed a search engine from scratch. I set up a wiki for a large organization. My days were writing code, testing, deploying, and fixing what broke.
I was not just an individual contributor. I led a team of Java developers, system administrators, and DBAs. I was hands-on with code and responsible for technical leadership and architecture. Doing both shaped how I think about building systems and leading people.
It was good work. I was good at it. Management was happy, and I received top ratings.
But at some point, the growth slowed down. The work became routine: add features to the front end, crawl more data sources into the search engine, repeat. I was not feeling challenged. I was not innovating. And I knew it.
The Turning Point
By 2016, “Data Science” and “Cognitive Computing” were impossible to ignore. AI had carried a negative reputation from its earlier failures, but deep learning was changing that. Watson had won Jeopardy! in 2011. Kaggle competitions were producing impressive results. Neural networks were being transformed by modern architectures.
I did not understand most of it. But I could see AI was beginning to matter. It was the earliest stage of adoption. Most companies were hesitant to explore AI. Success stories were rare.
Then, at the end of September 2016, before a three-week vacation, my manager Chad Marston pinged me. He said, “Shaikh, we want to begin working in Cognitive Computing. I want you to lead that team.”
I would be building a team of 8 to 10 developers, and none of us had any background in AI. We all came from Java and web development.
I went on vacation, but I kept thinking about it. When I came back, I was already exploring what I could learn and where I could get hands-on with this technology.
Chad sent me to the IBM Watson Developer Conference in San Francisco later that year. I attended hands-on labs where I played with Watson APIs. I saw demos. I listened to talks from Ginni Rometty, IBM’s CEO, and other experts. I had seen Watson before, but this was different. I saw more of what AI could do in practice. I came back with confidence and a clearer picture of what we could build.
The Hard Part
Leading 8 to 10 people while learning myself was not easy. Online learning platforms had plenty of courses, but the sheer number was overwhelming. Where do you start? What is worth your time? What will matter when you try to apply it?
The first thing I did was set up regular check-ins with my mentors at IBM: Jean-Francois Puget, Jorge Castañón, and Seth Dobrin. I got their guidance, filtered courses based on their recommendations, put together a learning path, and shared it with the team.
Even with structure, the learning curve was steep. The concept I struggled with longest was evaluation metrics. Accuracy was easy. Precision and recall were harder, and harder still to explain to management. The harder problem was mapping model metrics to business metrics. A model can perform well on test data, but how do you know it is helping the business?
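To make the gap between accuracy and precision/recall concrete, here is a minimal sketch with made-up numbers. It is not data from our project: the 100 labels, the 10% dissatisfaction rate, and the function names `confusion_counts` and `metrics` are all assumptions chosen for illustration.

```python
# Illustrative only: the labels below are invented, not data from a real project.
# 1 = dissatisfied customer (the class we care about), 0 = satisfied.

def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn

def metrics(y_true, y_pred):
    """Compute accuracy, precision, and recall for the positive class."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    # Guard against division by zero when nothing is predicted or labeled positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall}

# 10 dissatisfied customers out of 100 (hypothetical).
y_true = [1] * 10 + [0] * 90
# The model flags 4 customers: 2 correctly, 2 incorrectly, and misses 8.
y_pred = [1] * 2 + [0] * 8 + [1] * 2 + [0] * 88

result = metrics(y_true, y_pred)
print(result)
```

In this toy case the model is 90% accurate, yet only half of its flags are right (precision 0.5) and it catches only 2 of the 10 dissatisfied customers (recall 0.2). That is the kind of gap that is hard to see when accuracy is the only number reported.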
I have to admit: our first project failed. We were asked to predict customer satisfaction scores from support interactions. The model was accurate in training and testing. We deployed it. But we had no way to measure whether it was improving business outcomes. We failed quickly and learned from it.
Team members asked many questions I could not answer. I did not make things up. I admitted I did not know, then went and learned and came back. That honesty became part of how we operated.
The learning was hard for the whole team. We set up a weekly session where we watched videos together and discussed what we learned. Often, one team member would pre-read the material and present it. This kept us progressing together.
What Helped Us Learn
I tried many courses across Coursera, edX, Udemy, and DataCamp. A few stood out:
Andrew Ng’s Machine Learning course gave me a foundation in core algorithms: decision trees, linear and logistic regression, support vector machines, and neural networks. It was taught in Octave, not Python, but the concepts were what mattered.
Python for Data Science from DataCamp gave me fluency in Python once I had the conceptual foundation.
Two books were essential: An Introduction to Statistical Learning for statistical foundations, and Python Machine Learning by Sebastian Raschka for bridging theory to implementation.
But the most important learning did not come from courses. It came from applying concepts to real projects, getting stuck, and consulting mentors. That cycle of building, struggling, and asking for help was the core of my growth.
Where I Am Now
My AI journey started in 2014 with that Watson project. In fall 2016, I made the full transition, initially working with Watson’s NLP capabilities: sentiment analysis, question answering, and similar APIs. I became the lead machine learning engineer for IBM Analytics and held that role for three years. In 2019, I moved into my current role: AI Architect for IBM Db2.
My work today goes beyond building models. It has two dimensions. The first is improving database operations using AI, for example, applying machine learning to improve memory estimation for queries. The second is bringing AI infrastructure into the database itself, so that database administrators can run AI workloads directly within Db2.
Building AI into a database is different from building standalone models. The model must be small, fast, and light on resources. No human monitors these models, so they need self-feedback mechanisms. These were constraints I never faced before.
One project shows what my role looks like now: bringing vector support to Db2. At the beginning of 2025, Db2 had no vector capabilities. I owned this project end to end: planning, designing, architecting, leading the team, and contributing to development. We implemented vector type support, vector similarity search, and LLM API integration callable via SQL. We released it, then I created educational materials, gave talks, wrote blog posts, and promoted the work.
That cycle, from zero to shipped product to education to promotion, is what makes this role rewarding. I am challenged every day. I am not just building models. I am thinking about how AI fits into an enterprise database platform and how to make it useful for production workloads. I am more satisfied now than I ever was in my Java years.
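For readers new to vector similarity search, the idea mentioned above can be sketched in a few lines of plain Python. This is not Db2's implementation or its SQL syntax. The three-dimensional vectors, the ticket names, and the helpers `cosine_similarity` and `top_k` are hypothetical, and cosine similarity is only one of several distance measures a vector search could use.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)

def top_k(query, rows, k=2):
    """Return the keys of the k stored vectors most similar to the query."""
    scored = [(cosine_similarity(query, vec), key) for key, vec in rows.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [key for _, key in scored[:k]]

# Hypothetical embeddings; real ones would come from an embedding model
# and have hundreds of dimensions.
tickets = {
    "reset password": [1.0, 0.0, 0.0],
    "billing dispute": [0.0, 1.0, 0.0],
    "login failure": [0.9, 0.1, 0.0],
}

query_vector = [1.0, 0.05, 0.0]
nearest = top_k(query_vector, tickets, k=2)
print(nearest)
```

Here the two account-access tickets rank above the billing one. A database does the same kind of ranking, but over stored vector columns and typically with indexing so that it does not have to score every row.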
What I Would Tell Someone Making This Transition
Find a mentor early. If you want to become an AI architect, find someone in that role. Get their guidance on what to learn and what to skip. This will save you months.
Your experience is not a liability. I did not abandon my 11 years of Java. I built on it. Production systems, working with teams, managing projects: all of that transferred. The AI skills were an addition, not a replacement.
Your learning approach should change over time. Early on, I took courses end to end. That was useful for building foundations. But after the fundamentals, I changed my approach. Now I start with the project, then learn only what I need to apply. In 2021, I took an entire specialization on cloud deployment of AI. I never applied most of it, and those skills are gone. Project-driven learning is more effective once you have the foundations.
Model accuracy is not the end goal. I used to think a highly accurate model was all that mattered. I was wrong. What matters is identifying the right business metric and tying it to your model’s performance. A model can look great in testing but have no impact if you have not made that connection.
User adoption is the part nobody warns you about. Even after you get the model and metrics right, users fall back to old routines. Deploying AI is not just a technical problem. It requires a gradual rollout and buy-in from the people who will use it. Skip this, and your model sits unused.
Questions to Ask Yourself
1. Who do you know who is already an AI architect? Write down three names. This week, reach out to one of them with a single question about how they got there. If you do not have someone in mind and think I could help, reach out to me.
2. What skills from your current role would transfer to an AI architect role? List three.
3. What have you learned in the past two years that you never applied? Going forward, do less of that kind of learning.
4. What problem in your current work could benefit from AI? Name the problem and the outcome you want.
5. If you became an AI architect today, what is the first project you would want to work on?
Your Turn
If you are a senior engineer feeling that growth has stopped, the path I took is available to you. You do not need to go back to school. You do not need to start over as a junior. You need a structured path, the right guidance, and willingness to be uncomfortable for a while.
If this post was useful, follow me. I will publish more on lessons from my transition, with steps you can apply.
If you are considering a move to an AI architect role, I would like to hear from you. What are your struggles? What is holding you back? Share in the comments. Your questions will shape what I write next.






