The Data Scientist will work closely with our solution architects, data engineers, ML/MLOps engineers, and business stakeholders to design, develop, refine, and deploy scalable solutions. The role also involves working closely with our third-party AI platform provider, SparkBeyond (Baker McKenzie is SparkBeyond's exclusive legal industry partner, creating a unique market opportunity). The Data Scientist should have competencies in predictive modelling, forecasting, machine learning, deep learning, multi-class classification, risk analysis, and data pre-processing.

Baker McKenzie is investing in a new team, the Baker McKenzie Machine Learning Practice ("BakerML"), that combines data science, internal and external data, machine learning, and legal domain expertise to create new value for clients, our firm, and our communities around the world. You will be with us from the beginning as a key member of the BakerML team, working on pioneering services and solutions that deliver machine-learning-enabled legal judgment to support better legal decisions and outcomes. BakerML sits at the forefront of legal services innovation and is at the center of Reinvent, the firm's innovation program.

BakerML is looking for a Data Scientist with machine learning experience to join our "idea to operate" digital solution practice. You will own the entire data science lifecycle, from conception through prototyping, testing, and deployment to measuring business value. Your work will include defining and developing BakerML offerings through proofs of concept, piloting and delivering minimum viable solutions, and putting models into production. In addition, you will collaborate with the firm's clients and internal stakeholders to identify critical decision points in cross-industry legal processes where expert legal judgment provides measurable business value.


Responsibilities:

  • Develop code for preparing, extracting, and enriching structured and unstructured data sources, and work with business stakeholders to ensure the data suits business-of-law and practice-of-law needs. Conduct exploratory data analysis, formulate and test hypotheses, and discover non-obvious relationships in the data
  • Develop and implement machine learning algorithms to analyze and model structured and unstructured data. Determine which tools, features, and methodology are most appropriate for the particular business problem by understanding the benefits of different approaches
  • Apply multiple forms of advanced analysis, including sustainment, optimization, text analysis, machine learning, and parametric and non-parametric statistical modelling
  • Explain the technical details of a solution to business stakeholders across different client industries, including mathematical formulations, alternatives, and their impact on the modelling approach
  • Work with data engineers to build high-performance data pipelines and machine learning pipelines that can be deployed at scale. Use in-house platforms to develop solutions and build new capabilities
  • Develop technical approaches that yield actionable recommendations for advanced business analytics problems across multiple legal domains (M&A, tax, intellectual property, etc.). Identify patterns, trends, and insights to support legal and business decision-making
  • Work in a fast-paced and dynamic environment with a mix of virtual and face-to-face interaction. Develop structured solutions to problems, manage risks, and document assumptions, while communicating results and educating others through informative visualizations, reports, and presentations

Skills and Experience:

  • Strong relevant professional experience and a Master's degree (Ph.D. preferred) in a quantitative field (such as data science, mathematics, economics, computer science, statistics, econometrics, engineering, physics, neuroscience, or operations research)

The successful candidate should also possess the following experience and skills:

  • Ability to apply machine learning techniques to achieve concrete business objectives. Capable of working with business and IT stakeholders to understand the existing resources and constraints around data (sources, risks, integrity, and definitions). Capable of conveying complex business concepts through excellent written and verbal communication
  • Expertise in data preparation methodologies and machine learning algorithms. Solid experience performing data science activities, including data discovery, data cleaning, model selection, validation, and deployment
  • Strong knowledge of artificial intelligence methods and object-oriented programming in a software development process, as well as the ability to restructure, refactor, and optimize code for efficiency
  • Ability to understand ML modelling techniques and their real-world value and limitations (e.g., regression models, causal inference in strategic decisions, multi-objective and multi-modal optimization, advanced Bayesian techniques)
  • Ability to identify, diagnose, and correct common failure points in ML model development (e.g., leakage, overfitting, sampling biases, class imbalance)
  • Ability to utilize a diverse range of capabilities, technologies, and tools to deliver actionable insights, including programming fluency in Python, Scala, R, SQL, and Java. Ability to manage large datasets using distributed computing frameworks (Hive/Hadoop, Spark) and emerging cloud capabilities on Azure (preferred), GCP, or AWS
  • Ability to deploy models into production, with a deep awareness of the related challenges
  • Exceptional storyteller and communicator, able to develop relationships with a broad range of stakeholders
  • Experience leveraging data visualization tools and techniques to highlight patterns, outliers, and exceptional conditions in data
  • Comfortable with ambiguity and eager to shape the next set of questions to ask and actions to take to add clarity and drive impact