Tommaso Colella

Self-hosted Copilot using Google Gemma and Ollama (+ Phi-2 comparison)

It’s impossible to keep up with the rapid developments in the field of LLMs. On Feb 21, 2024, Google released a new family of models called Gemma. These models promise top performance for their size. The Gemma technical report claims that the 7B model outperforms all other open-weight models of the same or even bigger sizes, such as Llama-2 13B. 💡 According to the technical report accompanying Gemma’s announcement, the new models were trained using the same methodologies as Google’s top-tier Gemini models ...

DIY self-hosted Copilot using Phi-2 and Ollama

On December 12th, 2023, Microsoft released their latest “SLM” or “Small Language Model”, Phi-2. This new model is MIT-licensed. The permissive license makes it a perfect candidate for any experimentation, be it academic or commercial. Phi-2 is also a somewhat “green” model: it was trained with a lot less power than some of its bigger cousins, such as Llama-2 or Mistral. Well, it’s not THAT small (it has 2.7 billion parameters), but it is likely 2 to 3 orders of magnitude smaller than the state-of-the-art model GPT-4 by OpenAI (public data on the internals of GPT-4 is not available, so we can only speculate). ...

Deploying Azure OpenAI Service using OpenTofu

Lately, some friends have asked me to collaborate on a scientific paper regarding LLMs. As part of this collaboration, I had to assess the feasibility of automatically deploying an LLM on a public cloud. I’m lucky enough to work at a company focusing on AI and Microsoft-oriented system integration, and we’re always looking for new ways to bring value to the market using cutting-edge services. For this reason, I’ve been tinkering with the Azure OpenAI Service for a while: it offers business users the possibility to leverage the power of OpenAI’s most advanced models, such as GPT-4 Turbo, without sharing confidential organization data with third parties. ...