RatingConcept#
RatingConcept is a specialized concept type that calculates, infers, and derives rating values from documents within a clearly defined numerical scale.
📝 Overview#
RatingConcept enables sophisticated rating analysis from documents, allowing you to:
Derive implicit ratings: Calculate ratings based on sentiment analysis, key criteria, or contextual evaluation
Generate evaluative scores: Produce numerical assessments that quantify quality, relevance, or performance
Normalize diverse signals: Convert qualitative assessments into consistent numerical ratings within your defined scale
Synthesize overall scores: Combine multiple factors or opinions into comprehensive rating assessments
This concept type is particularly valuable for generating evaluative information from documents such as:
Product and service reviews where sentiment must be quantified on a standardized scale
Performance assessments requiring numerical quality or satisfaction scoring
Risk evaluations needing severity or probability measurements
Content analyses where subjective characteristics must be rated objectively
💻 Usage Example#
Here’s a simple example of how to use RatingConcept to extract a product rating:
# ContextGem: RatingConcept Extraction

import os

from contextgem import Document, DocumentLLM, RatingConcept, RatingScale

# Create a Document object from text describing a product without an explicit rating
smartphone_description = (
    "This smartphone features a 5000mAh battery that lasts all day with heavy use. "
    "The display is 6.7 inch AMOLED with 120Hz refresh rate. "
    "Camera system includes a 50MP main sensor, 12MP ultrawide, and 8MP telephoto lens. "
    "The phone runs on the latest processor with 8GB RAM and 256GB storage. "
    "It has IP68 water resistance and Gorilla Glass Victus protection."
)
doc = Document(raw_text=smartphone_description)

# Define a RatingConcept that requires analysis to determine a rating
product_quality = RatingConcept(
    name="Product Quality Rating",
    description=(
        "Evaluate the overall quality of the smartphone based on its specifications, "
        "features, and adherence to industry best practices"
    ),
    rating_scale=RatingScale(start=1, end=10),
    add_justifications=True,  # include justification for the rating
    justification_depth="balanced",
    justification_max_sents=5,
)

# Attach the concept to the document
doc.add_concepts([product_quality])

# Configure DocumentLLM with your API parameters
llm = DocumentLLM(
    model="azure/gpt-4.1",
    api_key=os.getenv("CONTEXTGEM_AZURE_OPENAI_API_KEY"),
    api_version=os.getenv("CONTEXTGEM_AZURE_OPENAI_API_VERSION"),
    api_base=os.getenv("CONTEXTGEM_AZURE_OPENAI_API_BASE"),
)

# Extract the concept from the document - the LLM will analyze and assign a rating
product_quality = llm.extract_concepts_from_document(doc)[0]

# Print the calculated rating
print(f"Quality Rating: {product_quality.extracted_items[0].value}")

# Print the justification
print(f"Justification: {product_quality.extracted_items[0].justification}")
⚙️ Parameters#
When creating a RatingConcept, you can specify the following parameters:

| Parameter | Type | Description |
|---|---|---|
| `name` | str | A unique name identifier for the concept |
| `description` | str | A clear description of what should be evaluated and rated, including the criteria for assigning different values within the rating scale (e.g., “Evaluate product quality based on features, durability, and performance where 1 represents poor quality and 10 represents exceptional quality”). The more specific the description, the more consistent and accurate the ratings will be. |
| `rating_scale` | RatingScale | Defines the boundaries for valid ratings (e.g., `RatingScale(start=1, end=10)`) |
| `llm_role` | str | The role of the LLM responsible for extracting the concept. Available values: `"extractor_text"`, `"reasoner_text"`, `"extractor_vision"`, `"reasoner_vision"` (defaults to `"extractor_text"`) |
| `add_justifications` | bool | Whether to include justifications for extracted items (defaults to `False`) |
| `justification_depth` | str | Justification detail level. Available values: `"brief"`, `"balanced"`, `"comprehensive"` (defaults to `"brief"`) |
| `justification_max_sents` | int | Maximum sentences in a justification (defaults to `2`) |
| `add_references` | bool | Whether to include source references for extracted items (defaults to `False`) |
| `reference_depth` | str | Source reference granularity. Available values: `"paragraphs"`, `"sentences"` (defaults to `"paragraphs"`) |
| `singular_occurrence` | bool | Whether this concept is restricted to having only one extracted item. If `True`, only a single rating item will be extracted (defaults to `False`) |
| `custom_data` | dict | Optional. Dictionary for storing any additional data that you want to associate with the concept. This data must be JSON-serializable. This data is not used for extraction but can be useful for custom processing or downstream tasks. Defaults to an empty dictionary. |
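For illustration, here is a minimal sketch that combines several of these parameters in a single definition. The concept name, description, scale, and custom_data values below are illustrative assumptions, not part of the API:

# Sketch: a RatingConcept combining several of the parameters above
# (the name, description, scale, and custom_data are illustrative)
from contextgem import RatingConcept, RatingScale

risk_rating = RatingConcept(
    name="Contract Risk Rating",
    description=(
        "Rate the overall legal risk of the contract, where 1 means minimal "
        "risk and 5 means severe risk requiring immediate attention"
    ),
    rating_scale=RatingScale(start=1, end=5),
    llm_role="reasoner_text",  # route extraction to a reasoning-capable LLM role
    add_justifications=True,
    justification_depth="brief",
    justification_max_sents=3,
    singular_occurrence=True,  # one overall rating, not one per section
    custom_data={"source": "legal-review-pipeline"},  # arbitrary JSON-serializable metadata
)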
🚀 Advanced Usage#
🔍 References and Justifications for Extraction#
When extracting a RatingConcept, it’s often useful to include justifications to understand the reasoning behind the score:
# ContextGem: RatingConcept Extraction with References and Justifications

import os

from contextgem import Document, DocumentLLM, RatingConcept, RatingScale

# Sample document text about a software product with various aspects
software_review = """
Software Review: ProjectManager Pro 5.0
User Interface: The interface is clean and modern, with intuitive navigation. New users can quickly find what they need without extensive training. The dashboard provides a comprehensive overview of project status.
Performance: The application loads quickly even with large projects. Resource-intensive operations like generating reports occasionally cause minor lag on older systems. The mobile app performs exceptionally well, even on limited bandwidth.
Features: Project templates are well-designed and cover most common project types. Task dependencies are easily managed, and the Gantt chart visualization is excellent. However, the software lacks advanced risk management tools that competitors offer.
Support: The documentation is comprehensive and well-organized. Customer service response time averages 4 hours, which is acceptable but not industry-leading. The knowledge base needs more video tutorials.
"""

# Create a Document from the text
doc = Document(raw_text=software_review)

# Create a RatingConcept with justifications and references enabled
usability_rating_concept = RatingConcept(
    name="Software usability rating",
    description=(
        "Evaluate the overall usability of the software on a scale of 1-10 "
        "based on UI design, intuitiveness, and learning curve"
    ),
    rating_scale=RatingScale(start=1, end=10),
    add_justifications=True,  # enable justifications to explain the rating
    justification_depth="comprehensive",  # provide detailed reasoning
    justification_max_sents=5,  # allow up to 5 sentences for justification
    add_references=True,  # include references to source text
    reference_depth="sentences",  # reference specific sentences rather than paragraphs
)

# Attach the concept to the document
doc.add_concepts([usability_rating_concept])

# Configure DocumentLLM with your API parameters
llm = DocumentLLM(
    model="azure/gpt-4.1",
    api_key=os.getenv("CONTEXTGEM_AZURE_OPENAI_API_KEY"),
    api_version=os.getenv("CONTEXTGEM_AZURE_OPENAI_API_VERSION"),
    api_base=os.getenv("CONTEXTGEM_AZURE_OPENAI_API_BASE"),
)

# Extract the concept
usability_rating_concept = llm.extract_concepts_from_document(doc)[0]

# Print the extracted rating item with justification and references
extracted_item = usability_rating_concept.extracted_items[0]
print(f"Software Usability Rating: {extracted_item.value}/10")
print(f"\nJustification: {extracted_item.justification}")
print("\nSource references:")
for sent in extracted_item.reference_sentences:
    print(f"- {sent.raw_text}")
⭐⭐ Multiple Rating Categories#
You can extract multiple rating categories from a document by creating separate rating concepts:
# ContextGem: Multiple RatingConcept Extraction

import os

from contextgem import Document, DocumentLLM, RatingConcept, RatingScale

# Sample document text about a restaurant review with multiple quality aspects to rate
restaurant_review = """
Restaurant Review: Bella Cucina
Atmosphere: The restaurant has a warm, inviting ambiance with soft lighting and comfortable seating. The décor is elegant without being pretentious, and the noise level allows for easy conversation.
Food Quality: The ingredients were fresh and high-quality. The pasta was perfectly cooked al dente, and the sauces were flavorful and well-balanced. The seafood dish had slightly overcooked shrimp, but the fish was excellent.
Service: Our server was knowledgeable about the menu and wine list. Water glasses were kept filled, and plates were cleared promptly. However, there was a noticeable delay between appetizers and main courses.
Value: Portion sizes were generous for the price point. The wine list offers selections at various price points, though markup is slightly higher than average for comparable restaurants in the area.
"""

# Create a Document from the text
doc = Document(raw_text=restaurant_review)

# Define a consistent rating scale to be used across all rating categories
restaurant_rating_scale = RatingScale(start=1, end=5)

# Define multiple rating concepts for different quality aspects of the restaurant
atmosphere_rating = RatingConcept(
    name="Atmosphere Rating",
    description="Rate the restaurant's atmosphere and ambiance",
    rating_scale=restaurant_rating_scale,
)
food_rating = RatingConcept(
    name="Food Quality Rating",
    description="Rate the quality, preparation, and taste of the food",
    rating_scale=restaurant_rating_scale,
)
service_rating = RatingConcept(
    name="Service Rating",
    description="Rate the efficiency, knowledge, and attentiveness of the service",
    rating_scale=restaurant_rating_scale,
)
value_rating = RatingConcept(
    name="Value Rating",
    description="Rate the value for money considering portion sizes and pricing",
    rating_scale=restaurant_rating_scale,
)

# Attach all concepts to the document
doc.add_concepts([atmosphere_rating, food_rating, service_rating, value_rating])

# Configure DocumentLLM with your API parameters
llm = DocumentLLM(
    model="azure/gpt-4.1",
    api_key=os.getenv("CONTEXTGEM_AZURE_OPENAI_API_KEY"),
    api_version=os.getenv("CONTEXTGEM_AZURE_OPENAI_API_VERSION"),
    api_base=os.getenv("CONTEXTGEM_AZURE_OPENAI_API_BASE"),
)

# Extract all concepts from the document
extracted_concepts = llm.extract_concepts_from_document(doc)

# Print all ratings
print("Restaurant Ratings (1-5 scale):")
for concept in extracted_concepts:
    if concept.extracted_items:
        print(f"{concept.name}: {concept.extracted_items[0].value}/5")

# Calculate and print overall average rating
average_rating = sum(
    concept.extracted_items[0].value for concept in extracted_concepts
) / len(extracted_concepts)
print(f"\nOverall Rating: {average_rating:.1f}/5")
📊 Extracted Items#
When a RatingConcept is extracted, it is populated with a list of extracted items accessible through the .extracted_items property. Each item is an instance of the _IntegerItem class with the following attributes:
| Attribute | Type | Description |
|---|---|---|
| `value` | int | The extracted rating value as an integer within the defined rating scale |
| `justification` | str | Explanation of why this rating was extracted (only if `add_justifications=True`) |
| `reference_paragraphs` | list[Paragraph] | List of paragraph objects that influenced the rating determination (only if `add_references=True`) |
| `reference_sentences` | list[Sentence] | List of sentence objects that influenced the rating determination (only if `add_references=True` and `reference_depth="sentences"`) |
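The advanced example above already demonstrates sentence-level references; as a complement, here is a minimal sketch of accessing paragraph-level references through reference_paragraphs. It assumes an already configured llm object (as in the earlier examples), and the document text and concept definition are illustrative:

# Sketch: paragraph-level references via reference_paragraphs
# (assumes `llm` is a DocumentLLM configured as in the examples above;
# the document text and concept definition are illustrative)
from contextgem import Document, RatingConcept, RatingScale

doc = Document(
    raw_text=(
        "The laptop's build quality is excellent, with a sturdy aluminum chassis.\n"
        "Battery life is mediocre, lasting around five hours of mixed use."
    )
)
build_rating = RatingConcept(
    name="Build Quality Rating",
    description="Rate the build quality of the laptop",
    rating_scale=RatingScale(start=1, end=5),
    add_references=True,  # populate reference_* attributes on extracted items
    reference_depth="paragraphs",  # the default granularity
)
doc.add_concepts([build_rating])
build_rating = llm.extract_concepts_from_document(doc)[0]

item = build_rating.extracted_items[0]
print(f"Build quality: {item.value}/5")  # int within the defined scale
for para in item.reference_paragraphs:  # populated because add_references=True
    print(f"- {para.raw_text}")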
💡 Best Practices#
Create descriptive names for your rating concepts that clearly indicate what aspect is being evaluated (e.g., “Product Usability Rating” rather than just “Rating”).
Enhance extraction quality by including clear definitions of what each point on the scale represents in your concept description (e.g., “1 = poor, 3 = average, 5 = excellent”).
Provide specific evaluation criteria in your concept description to guide the LLM’s assessment process. For example, when rating software usability, specify that factors like interface intuitiveness, learning curve, and navigation efficiency should be considered.
Enable justifications (using add_justifications=True) when you need to understand the reasoning behind a rating, which is particularly valuable for evaluations that involve complex criteria where the rationale may not be immediately obvious from the score alone.
Enable references (using add_references=True) to trace ratings back to specific evidence in the document that informed the evaluation.
Apply singular_occurrence=True for concepts that should yield a single comprehensive rating (like an overall product score) rather than multiple ratings throughout the document.
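Putting these practices together, here is a sketch of a concept definition that applies them; the concept name, evaluation criteria, and scale-point definitions below are illustrative assumptions:

# Sketch: a concept definition applying the best practices above
# (the name, criteria, and scale-point definitions are illustrative)
from contextgem import RatingConcept, RatingScale

usability_rating = RatingConcept(
    name="Product Usability Rating",  # descriptive, aspect-specific name
    description=(
        "Rate the product's usability considering interface intuitiveness, "
        "learning curve, and navigation efficiency, where 1 = very difficult "
        "to use, 3 = average, and 5 = effortless"
    ),
    rating_scale=RatingScale(start=1, end=5),
    add_justifications=True,   # surface the reasoning behind the score
    add_references=True,       # trace the rating back to source evidence
    singular_occurrence=True,  # one comprehensive rating for the document
)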