Qwen 2.5 3b Text to SQL

Fine-Tuned LLM for Text-to-SQL Conversion

Model Description

This model is a fine-tuned version of Qwen/Qwen2.5-3B-Instruct designed to convert natural language queries into SQL statements. It was fine-tuned on the gretelai/synthetic_text_to_sql dataset and can return both a SQL query and a table schema context when no schema is provided. Across all versions, it has over 1,100 combined downloads.

Model Details

  • Base Model: Qwen/Qwen2.5-3B-Instruct
  • Dataset: Gretel AI Synthetic Text-to-SQL Dataset
  • Language: English
  • License: MIT
  • Community Adoption: Over 1,100 downloads across all versions

Key Features

  • Text-to-SQL Conversion: Converts natural language queries into accurate SQL statements.
  • Schema Generation: Generates table schema context when none is provided.
  • Optimized for Analytics and Reporting: Handles SQL queries with aggregation, grouping, and filtering.

Usage

Direct Use

To use the model for text-to-SQL conversion, you can load it using the transformers library as shown below:

from transformers import AutoTokenizer, AutoModelForCausalLM

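# Note: the repository name ends in "-GGUF". If it only holds GGUF files,
# transformers usually also needs a gguf_file="<file name>" argument here.
tokenizer = AutoTokenizer.from_pretrained("Ellbendls/Qwen-2.5-3b-Text_to_SQL-GGUF")
model = AutoModelForCausalLM.from_pretrained("Ellbendls/Qwen-2.5-3b-Text_to_SQL-GGUF")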

# Input prompt
query = "What is the total number of hospital beds in each state?"

# Tokenize the input and generate output.
# max_length caps the prompt tokens and the generated tokens combined.
inputs = tokenizer(query, return_tensors="pt")
outputs = model.generate(**inputs, max_length=512)

# Decode and print
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
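
Decoder-only models return the prompt followed by the continuation from generate, so the decoded string usually starts with the question itself. The sketch below shows one way to drop that echo, assuming the checkpoint behaves this way. strip_prompt is a hypothetical helper (not part of transformers), and the decoded string is an illustrative stand-in, not real model output.

```python
def strip_prompt(decoded: str, prompt: str) -> str:
    """Return the generated continuation without the echoed prompt."""
    text = decoded.strip()
    # Only strip when the decoded text really starts with the prompt.
    if text.startswith(prompt):
        text = text[len(prompt):]
    return text.strip()


question = "What is the total number of hospital beds in each state?"

# Illustrative string only (assumption): prompt followed by a continuation.
decoded = question + " SQL Query: SELECT 1;"

completion = strip_prompt(decoded, question)
print(completion)
```

When working with tensors directly, the same effect can be had by slicing off the prompt tokens before decoding, for example outputs[0][inputs["input_ids"].shape[1]:].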

Example Output

Input:

What is the total number of hospital beds in each state?

Output:

Context: CREATE TABLE Beds (State VARCHAR(50), Beds INT); INSERT INTO Beds (State, Beds) VALUES ('California', 100000), ('Texas', 85000), ('New York', 70000);
SQL Query: SELECT State, SUM(Beds) FROM Beds GROUP BY State;
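
Because the output combines a schema context and a query in one string, it usually has to be split before use. The sketch below assumes the "Context:" and "SQL Query:" markers always appear as in the example above. split_output is a hypothetical helper, and running the result in an in-memory SQLite database is just one convenient way to sanity-check it.

```python
import sqlite3


def split_output(text: str) -> tuple[str, str]:
    """Split model output into (schema context, SQL query)."""
    marker = "SQL Query:"
    context_part, sep, query_part = text.partition(marker)
    if not sep:
        raise ValueError("no 'SQL Query:' marker found in model output")
    # Drop the leading "Context:" label, keeping only the SQL statements.
    context = context_part.replace("Context:", "", 1).strip()
    return context, query_part.strip()


# The example output from this model card, copied verbatim.
output = (
    "Context: CREATE TABLE Beds (State VARCHAR(50), Beds INT); "
    "INSERT INTO Beds (State, Beds) VALUES ('California', 100000), "
    "('Texas', 85000), ('New York', 70000); "
    "SQL Query: SELECT State, SUM(Beds) FROM Beds GROUP BY State;"
)

context, query = split_output(output)

# Use a throwaway in-memory database so generated SQL cannot touch real data.
conn = sqlite3.connect(":memory:")
conn.executescript(context)                      # create and populate the table
rows = sorted(conn.execute(query).fetchall())    # sort for a stable order
conn.close()
print(rows)
```

Generated SQL should be treated as untrusted input. Executing it only against a disposable database, as here, keeps a bad query from modifying anything that matters.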

Training Details

Dataset

The model was fine-tuned on the gretelai/synthetic_text_to_sql dataset, which includes diverse natural language queries mapped to SQL queries, with optional schema contexts.

Limitations

  • Complex Queries: May struggle with highly nested or advanced SQL tasks.
  • Non-English Prompts: Optimized for English only.
  • Context Dependence: Without an explicit schema in the prompt, the schema the model generates may be incorrect.
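
One way to guard against the schema limitation above is to compile each generated query against the real schema before running it. The sketch below does this with SQLite's EXPLAIN, which prepares a statement without executing it. The hospitals table, the query_matches_schema helper, and both queries are hypothetical, and the check only covers SQL that SQLite can parse.

```python
import sqlite3


def query_matches_schema(schema_sql: str, query: str) -> bool:
    """Return True if the query compiles against the given schema in SQLite."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(schema_sql)
        # EXPLAIN compiles the statement, so unknown tables or columns
        # raise an error without the query itself being run.
        conn.execute("EXPLAIN " + query)
        return True
    except sqlite3.Error:
        return False
    finally:
        conn.close()


# Hypothetical real schema; it differs from the schema the model invented.
real_schema = "CREATE TABLE hospitals (state TEXT, bed_count INTEGER);"

invented = "SELECT State, SUM(Beds) FROM Beds GROUP BY State;"
matching = "SELECT state, SUM(bed_count) FROM hospitals GROUP BY state;"

invented_ok = query_matches_schema(real_schema, invented)
matching_ok = query_matches_schema(real_schema, matching)
print(invented_ok, matching_ok)
```

A query that fails this check most likely refers to tables or columns the model made up, which suggests retrying with the real schema included in the prompt.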