spec:
  id: a8674536-552e-4289-b62b-90918608e652
  name: wikipedia_25k_to_50k_tokens_docs spec claude
  description: ''
  datasetId: 399ae716-7641-4d3a-afff-c52a9085e1fa
  datasetName: wikipedia_25k_to_50k_tokens_docs
  requirements: >-
    The data point must be a comprehensive, long-form Wikipedia-style article on a medical or healthcare topic.
    The article must be structured with multiple hierarchical sections and subsections, each with clear headings.
    The content must be factual, informative, and written in an objective, encyclopedic tone.
    The article must contain specific references to dates, statistics, people, places, and/or institutions.
    The article must include various formatting elements such as paragraphs of varying lengths, bullet points or lists in some sections, and descriptive text that explains complex concepts.
    The article must be detailed enough to cover the topic thoroughly, including historical context, key developments, important figures, institutional structures, and current state.
    The article must maintain a neutral point of view while presenting comprehensive information.
  spec_version: '1'
  data_property_variations:
    - base_distributions:
        Chronological progression from ancient to modern times: 20
        Comparative structure across multiple entities: 5
        Geographic organization by regions or countries: 15
        Mixed chronological-thematic structure: 10
        Problem-solution organizational framework: 5
        Systematic organization by components of a system: 20
        Thematic organization by major topic areas: 25
      conditional_distributions: {}
      property_name: Primary organizational structure
      property_values:
        - Chronological progression from ancient to modern times
        - Thematic organization by major topic areas
        - Geographic organization by regions or countries
        - Systematic organization by components of a system
        - Mixed chronological-thematic structure
        - Problem-solution organizational framework
        - Comparative structure across multiple entities
    - base_distributions:
        Concentrates on contemporary period with minimal history: 15
        Covers a specific historical period in depth: 15
        Covers several centuries with detailed historical progression: 20
        Emphasizes recent decades with ongoing developments: 10
        Focuses on modern era with brief historical background: 25
        Spans thousands of years from prehistoric to contemporary: 15
      conditional_distributions: {}
      property_name: Temporal scope
      property_values:
        - Spans thousands of years from prehistoric to contemporary
        - Covers several centuries with detailed historical progression
        - Focuses on modern era with brief historical background
        - Concentrates on contemporary period with minimal history
        - Covers a specific historical period in depth
        - Emphasizes recent decades with ongoing developments
    - base_distributions:
        Comparative analysis across multiple countries: 15
        Focus on specific cultural or ethnic groups: 5
        Global coverage spanning multiple continents and civilizations: 25
        Regional focus on a specific continent or area: 15
        Single country or nation focus: 30
        Western-centric with some international coverage: 10
      conditional_distributions: {}
      property_name: Geographical focus
      property_values:
        - Global coverage spanning multiple continents and civilizations
        - Single country or nation focus
        - Regional focus on a specific continent or area
        - Comparative analysis across multiple countries
        - Western-centric with some international coverage
        - Focus on specific cultural or ethnic groups
    - base_distributions:
        Broad overview of entire medical field or system: 20
        Healthcare policy and system structure: 20
        Medical ethics and professional practice: 8
        Medical technology or methodology: 10
        Particular disease, condition, or health issue: 10
        Pharmaceutical and treatment modalities: 7
        Public health and epidemiology: 10
        Specific medical specialty or subspecialty: 15
      conditional_distributions: {}
      property_name: Subject matter specificity
      property_values:
        - Broad overview of entire medical field or system
        - Specific medical specialty or subspecialty
        - Particular disease, condition, or health issue
        - Healthcare policy and system structure
        - Medical technology or methodology
        - Public health and epidemiology
        - Medical ethics and professional practice
        - Pharmaceutical and treatment modalities
    - base_distributions:
        Balanced presentation of criticism alongside achievements: 10
        Heavy emphasis on ongoing debates and competing viewpoints: 10
        Minimal controversy, primarily historical or descriptive: 25
        Moderate inclusion of debates and criticisms: 30
        Substantial dedicated sections on controversies and challenges: 25
      conditional_distributions: {}
      property_name: Emphasis on controversy and debate
      property_values:
        - Minimal controversy, primarily historical or descriptive
        - Moderate inclusion of debates and criticisms
        - Substantial dedicated sections on controversies and challenges
        - Heavy emphasis on ongoing debates and competing viewpoints
        - Balanced presentation of criticism alongside achievements
    - base_distributions:
        Heavy use of statistics, percentages, and numerical data throughout: 20
        Mix of quantitative and qualitative information: 20
        Moderate statistical data integrated into narrative: 25
        Primarily qualitative with occasional quantitative examples: 15
        Sparse statistics, primarily dates and historical figures: 20
      conditional_distributions: {}
      property_name: Density of quantitative data
      property_values:
        - Heavy use of statistics, percentages, and numerical data throughout
        - Moderate statistical data integrated into narrative
        - Sparse statistics, primarily dates and historical figures
        - Mix of quantitative and qualitative information
        - Primarily qualitative with occasional quantitative examples
    - base_distributions:
        Brief mention of disparities in specific contexts: 20
        Extensive discussion of inequalities and access issues: 25
        Minimal focus on inequality issues: 15
        Moderate coverage of demographic and socioeconomic factors: 30
        No significant coverage of disparities: 10
      conditional_distributions: {}
      property_name: Coverage of socioeconomic disparities
      property_values:
        - Extensive discussion of inequalities and access issues
        - Moderate coverage of demographic and socioeconomic factors
        - Brief mention of disparities in specific contexts
        - Minimal focus on inequality issues
        - No significant coverage of disparities
    - base_distributions:
        Brief mentions of economic aspects: 20
        Extensive analysis of costs, spending, and economic models: 25
        Focus on philanthropic and charitable funding models: 5
        Minimal economic or financial content: 20
        Moderate discussion of funding structures and expenditures: 30
      conditional_distributions: {}
      property_name: Depth of coverage on funding and economics
      property_values:
        - Extensive analysis of costs, spending, and economic models
        - Moderate discussion of funding structures and expenditures
        - Brief mentions of economic aspects
        - Minimal economic or financial content
        - Focus on philanthropic and charitable funding models
    - base_distributions:
        Dedicated sections comparing multiple approaches: 20
        Extensive comparisons with other countries or systems: 20
        Minimal comparative content: 20
        No comparative analysis: 15
        Occasional comparative references: 25
      conditional_distributions: {}
      property_name: Inclusion of comparative analysis
      property_values:
        - Extensive comparisons with other countries or systems
        - Dedicated sections comparing multiple approaches
        - Occasional comparative references
        - Minimal comparative content
        - No comparative analysis
    - base_distributions:
        Brief mentions of key people in context: 25
        Heavy focus on biographical information and contributions of individuals: 25
        Minimal focus on individuals, more on institutions and systems: 15
        Moderate inclusion of important figures and their work: 30
        No significant emphasis on specific people: 5
      conditional_distributions: {}
      property_name: Emphasis on key historical figures
      property_values:
        - Heavy focus on biographical information and contributions of individuals
        - Moderate inclusion of important figures and their work
        - Brief mentions of key people in context
        - Minimal focus on individuals, more on institutions and systems
        - No significant emphasis on specific people
    - base_distributions:
        Brief mentions with critical scientific perspective: 25
        Minimal coverage of non-mainstream approaches: 20
        Moderate discussion of historical and alternative practices: 25
        No significant discussion of alternative medicine: 10
        Substantial coverage of traditional medical systems alongside modern medicine: 20
      conditional_distributions: {}
      property_name: Treatment of traditional and alternative medicine
      property_values:
        - Substantial coverage of traditional medical systems alongside modern medicine
        - Moderate discussion of historical and alternative practices
        - Brief mentions with critical scientific perspective
        - Minimal coverage of non-mainstream approaches
        - No significant discussion of alternative medicine
    - base_distributions:
        Brief mentions of current state: 20
        Dedicated sections on present-day issues and ongoing crises: 25
        Major focus on current problems and future outlook: 20
        Minimal contemporary content, primarily historical: 10
        Moderate discussion of contemporary challenges: 25
      conditional_distributions: {}
      property_name: Inclusion of contemporary challenges
      property_values:
        - Major focus on current problems and future outlook
        - Dedicated sections on present-day issues and ongoing crises
        - Moderate discussion of contemporary challenges
        - Brief mentions of current state
        - Minimal contemporary content, primarily historical
    - base_distributions:
        Brief mentions of key technological developments: 25
        Extensive coverage of medical technology evolution and innovation: 25
        Heavy emphasis on digital health and modern tech: 5
        Minimal focus on technology: 15
        Moderate discussion of tools, equipment, and technological progress: 30
      conditional_distributions: {}
      property_name: Discussion of technological advancement
      property_values:
        - Extensive coverage of medical technology evolution and innovation
        - Moderate discussion of tools, equipment, and technological progress
        - Brief mentions of key technological developments
        - Minimal focus on technology
        - Heavy emphasis on digital health and modern tech
    - base_distributions:
        Broad overview without deep specialty coverage: 10
        Minimal subdivision into specialized topics: 10
        Moderate coverage of subspecialties: 20
        Multiple detailed subsections on specialized areas: 30
        Several dedicated sections for different specialties or aspects: 30
      conditional_distributions: {}
      property_name: Inclusion of specialized subtopics
      property_values:
        - Multiple detailed subsections on specialized areas
        - Several dedicated sections for different specialties or aspects
        - Moderate coverage of subspecialties
        - Minimal subdivision into specialized topics
        - Broad overview without deep specialty coverage
    - base_distributions:
        Brief mentions of ethical issues in context: 25
        Minimal ethical content: 20
        Moderate coverage of ethical considerations: 25
        No significant ethical discussion: 10
        Substantial discussion of ethics, morality, and professional conduct: 20
      conditional_distributions: {}
      property_name: Treatment of ethical and moral dimensions
      property_values:
        - Substantial discussion of ethics, morality, and professional conduct
        - Moderate coverage of ethical considerations
        - Brief mentions of ethical issues in context
        - Minimal ethical content
        - No significant ethical discussion
  selected_sql_schema_column: null
  selected_sql_query_columns: []
  createdAt: '2026-01-09'
  updatedAt: '2026-01-09'
