mirror of
https://github.com/LearningCircuit/local-deep-research.git
synced 2026-06-16 12:02:34 +03:00
docs(journal-quality): clarify score scale is non-contiguous

The docs and settings description previously advertised a "1-10 scale"
and referenced score 3 ("Unknown") in the threshold table, but the
code only emits {1, 4, 5, 6, 7, 8, 10}. Values 2, 3, and 9 are never
assigned (the default/unknown case emits 4, not 3).

- Fix the opening scale claim to note the non-contiguous emission.
- Replace the "Score 3 = Unknown" row with "Score 4 = Default" so the
  table matches constants.py (JOURNAL_QUALITY_DEFAULT=4).
- Correct the threshold table: thresholds 3 and 4 now behave the same
  as 2 (since 2 and 3 aren't emitted scores), and raising to 5 is
  what starts dropping default/unknown venues.
- Update default_settings.json description and regenerate golden
  master to match.
@@ -4,30 +4,30 @@ The journal quality system automatically scores academic journals encountered du
 
 ## Overview
 
-When you search using academic engines (ArXiv, OpenAlex, Semantic Scholar, NASA ADS), every journal is automatically scored on a 1-10 scale using real bibliometric data. Predatory journals are auto-removed from results. Scores are cached so subsequent lookups are instant.
+When you search using academic engines (ArXiv, OpenAlex, Semantic Scholar, NASA ADS), every journal is automatically scored on a 1–10 scale using real bibliometric data. The emitted scores are non-contiguous — the system only produces `{1, 4, 5, 6, 7, 8, 10}`; values 2, 3, and 9 are reserved but never assigned. Predatory journals are auto-removed from results. Scores are cached so subsequent lookups are instant.
 
 ## Quality Scale
 
 | Score | Tier | Description | Example |
 |-------|------|-------------|---------|
-| 9-10 | Elite | Top-tier journals with h-index ≥ 150 | Nature, Science, NEJM |
-| 7-8 | Strong | Strong Q1 journals, h-index 40-149 | PLOS ONE, IEEE Trans., DOAJ Seal |
-| 5-6 | Moderate | Solid journals, DOAJ-listed OA journals | Many field-specific journals |
-| 4 | Low | Low h-index but indexed in OpenAlex | Newer or niche journals |
-| 3 | Unknown | No data in any bundled source | Repositories, garbled refs |
+| 10 | Elite | Top-tier journals with h-index ≥ 150 | Nature, Science, NEJM |
+| 7–8 | Strong | Strong Q1 journals, h-index 40–149, or DOAJ Seal | PLOS ONE, IEEE Trans. |
+| 5–6 | Moderate | Solid journals, DOAJ-listed OA journals | Many field-specific journals |
+| 4 | Default | Low h-index or unknown venue (no data in any bundled source) | Newer, niche, or unindexed journals |
 | 1 | Predatory | Flagged by Stop Predatory Journals — auto-removed | SPJ list entries |
 
 ## Threshold Semantics
 
-The threshold setting controls how aggressively the filter drops results:
+The threshold setting controls how aggressively the filter drops results (the filter rejects any journal whose score is below the threshold):
 
 | Threshold | Effect |
 |-----------|--------|
-| **2 (default)** | Drop only predatory (score 1). Keep everything else, including unknowns. |
-| 3 | Same as 2 — predatory only (no scores fall in this gap). |
-| 4 | Also drop unknown / low-confidence venues (score 3). |
-| 5 | Also drop the long tail of legit-but-low-h-index journals (score 4). |
-| 6+ | Keep only Q1/strong journals. Aggressive — use only when you specifically want elite-only filtering. |
+| **2 (default)** | Drop only predatory (score 1). Keep everything else — including default/unknown venues. |
+| 3 | Same as 2 — no scores fall in the 2–3 gap. |
+| 4 | Same as 2 — no scores fall in the 2–3 gap. |
+| 5 | Also drop default/unknown venues (score 4). |
+| 6 | Also drop the long tail of moderate journals (score 5). |
+| 7+ | Keep only strong/elite journals. Aggressive — use only when you specifically want high-quality filtering. |
 
 The default of **2** is intentionally conservative: it removes flagged predatory venues (we have positive evidence of fraud) but doesn't silently delete sources just because we don't have bibliometric data on them.
 
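The threshold semantics in the hunk above can be checked with a small Python sketch. It assumes only what the commit message and the updated docs state: the emitted scores are `{1, 4, 5, 6, 7, 8, 10}`, and the filter rejects any journal whose score is below the threshold. The names `EMITTED_SCORES`, `passes_filter`, and `kept_scores` are hypothetical and chosen for illustration; they are not identifiers from the repository.

```python
# Scores the commit message says the code can emit; 2, 3 and 9 never occur.
# (Hypothetical constant for illustration, not taken from constants.py.)
EMITTED_SCORES = (1, 4, 5, 6, 7, 8, 10)


def passes_filter(score: int, threshold: int) -> bool:
    """Keep a journal unless its score is below the threshold."""
    return score >= threshold


def kept_scores(threshold: int) -> tuple[int, ...]:
    """Return the emitted scores that survive a given threshold."""
    return tuple(s for s in EMITTED_SCORES if passes_filter(s, threshold))


if __name__ == "__main__":
    # Print which emitted scores survive each allowed threshold (1 to 10).
    for threshold in range(1, 11):
        print(threshold, kept_scores(threshold))
```

Under these assumptions, thresholds 2, 3, and 4 keep the same set of scores, which is why the corrected table lists 3 and 4 as "Same as 2". For the same reason, thresholds 9 and 10 would be equivalent, since no journal is ever scored 9.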
@@ -1166,7 +1166,7 @@
 },
 "search.journal_reputation.threshold": {
 "category": "journal_quality_filter_parameters",
-"description": "Minimum journal quality score (1-10 scale) required for academic results. Sources from journals below this threshold are excluded when the filter is enabled. Default 2 = drop only predatory journals (score 1); raise to 4 to also drop unknown/low-confidence venues; raise to 6+ to keep only Q1-strong journals.",
+"description": "Minimum journal quality score required for academic results. Sources below this threshold are excluded when the filter is enabled. Scores emitted: 1 predatory, 4 default/unknown, 5-6 moderate, 7-8 strong, 10 elite (values 2, 3, 9 are reserved but never assigned). Default 2 = drop only predatory; raise to 5 to also drop default/unknown venues; raise to 7+ to keep only strong/elite journals.",
 "editable": true,
 "max_value": 10,
 "min_value": 1,
@@ -7335,7 +7335,7 @@
 },
 "search.journal_reputation.threshold": {
 "category": "journal_quality_filter_parameters",
-"description": "Minimum journal quality score (1-10 scale) required for academic results. Sources from journals below this threshold are excluded when the filter is enabled. Default 2 = drop only predatory journals (score 1); raise to 4 to also drop unknown/low-confidence venues; raise to 6+ to keep only Q1-strong journals.",
+"description": "Minimum journal quality score required for academic results. Sources below this threshold are excluded when the filter is enabled. Scores emitted: 1 predatory, 4 default/unknown, 5-6 moderate, 7-8 strong, 10 elite (values 2, 3, 9 are reserved but never assigned). Default 2 = drop only predatory; raise to 5 to also drop default/unknown venues; raise to 7+ to keep only strong/elite journals.",
 "editable": true,
 "max_value": 10,
 "min_value": 1,