Average Rice Purity Score — Methodology (Privacy‑First Placeholder)
Purpose of this page: Explain how we will discuss “average” or “typical” Rice Purity results without storing anyone’s personal answers. Until we meet our safety thresholds, we will show methodology and example placeholders only.
Why “average” needs extra care
“Average Rice Purity Score” is a popular query, but the phrase can be misleading. Any number depends on who took the quiz (audience), when (time window), which version (wording and SFW defaults), and how the site computes scores (flat vs. weighted). On top of that, we are a privacy‑first site: we do not upload or store your individual selections. That means we must use careful, aggregate‑only techniques and minimum sample sizes to publish group summaries.
Our goal is to inform readers with distribution ranges, not to rank people or encourage oversharing.
What we will (and won’t) publish
We will publish:
- Range distributions (e.g., 90–100, 80–89, 70–79, 60–69, 0–59) as counts or percentages.
- Time‑windowed summaries (e.g., “last 30 days,” once the window passes minimum sample thresholds).
- Variant splits where appropriate (classic vs. SFW), but never subgroups small enough to re‑identify individuals.
We will not publish:
- Raw per‑question answers, user IDs, IP addresses, device fingerprints, emails, or usernames.
- Single‑user scores or micro‑segments that could re‑identify people (e.g., tiny cohorts).
- Longitudinal tracking of a specific person across sessions.
Need qualitative context? Review the score meaning guide or head back to the interactive test when you are ready to add more anonymous responses.
The aggregation model (high level)
- Room‑level (opt‑in) aggregation — For friends or classroom “challenges,” participants submit only a score range bin (e.g., “70–79”) into a temporary room. The room shows a bar chart of ranges. No nicknames, no individual scores or answers are stored. When the room expires (e.g., 24–48h), the counts are deleted (see the sketch after this list).
- Site‑wide aggregation — If enabled and opt‑in, we count range bins only for a moving time window (e.g., rolling 30 days). We never store full scores or answers—just how many participants fell into each range. If a cohort or window is too small, we display “insufficient data.”
- Differential privacy (optional, when needed) — For borderline cases, we may add tiny random noise to each bin count before publishing. This protects individuals while preserving the shape of the distribution. If used, we will disclose the noise scale.
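To make the room‑level flow concrete, here is a minimal sketch. The store, helper names, and 24‑hour default are illustrative assumptions, not an existing API; a real deployment would likely use an expiring key‑value store instead of an in‑memory map.

```typescript
// Minimal sketch of room-level, bins-only aggregation. Assumptions: an in-memory
// Map stands in for storage, and the helper names are illustrative, not a real API.
type Bin = "90-100" | "80-89" | "70-79" | "60-69" | "0-59";

interface Room {
  counts: Record<Bin, number>; // only bin tallies; no scores, answers, or nicknames
  expireAt: number;            // epoch ms; counts are deleted after this time
}

const rooms = new Map<string, Room>();

function createRoom(roomId: string, ttlHours = 24): Room {
  const room: Room = {
    counts: { "90-100": 0, "80-89": 0, "70-79": 0, "60-69": 0, "0-59": 0 },
    expireAt: Date.now() + ttlHours * 60 * 60 * 1000,
  };
  rooms.set(roomId, room);
  return room;
}

function submitBin(roomId: string, bin: Bin): void {
  const room = rooms.get(roomId);
  if (!room || Date.now() > room.expireAt) return; // expired rooms accept nothing
  room.counts[bin] += 1;                           // a tally is incremented; nothing else is kept
}

function sweepExpiredRooms(): void {
  for (const [id, room] of rooms) {
    if (Date.now() > room.expireAt) rooms.delete(id); // automatic deletion on expiry
  }
}
```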
Score bins (public and stable)
We pre‑define the following public bins to keep the method stable and auditable:
- 90–100 (Cautious range)
- 80–89 (Moderate range)
- 70–79 (Broad range)
- 60–69 (Adventurous range)
- 0–59 (Extensive range)
These bins match our Score Meaning page for consistent interpretation.
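For illustration, mapping a score to one of these bins could look like the sketch below. The assumption is that this mapping runs on the participant's device, so only the bin label ever leaves it; the function name is hypothetical.

```typescript
// Sketch of mapping a locally computed score to one of the public bins.
// Assumption: this runs client-side, so the raw score never leaves the device.
type Bin = "90-100" | "80-89" | "70-79" | "60-69" | "0-59";

function scoreToBin(score: number): Bin {
  if (score >= 90) return "90-100";
  if (score >= 80) return "80-89";
  if (score >= 70) return "70-79";
  if (score >= 60) return "60-69";
  return "0-59";
}

// Example: scoreToBin(73) === "70-79"
```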
Minimum sample thresholds (to avoid false precision)
We will publish a distribution only when all of the following are true:
- Total N ≥ 500 participants in the time window (site‑wide) or Total N ≥ 10 (room‑level).
- Per‑bin N ≥ 10 after any optional noise is applied; bins below 10 will be merged upward (e.g., “≤69”) or shown as “<10 (with noise).”
- Time window: at least 7 full days of data; sudden spikes are rolled into the next publishing cycle.
If thresholds aren’t met, we show a neutral placeholder and a note: “We’ll post updated distributions once we have enough anonymous participation.”
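A simplified sketch of that publication gate follows. Constants and names are illustrative, it checks only the site‑wide total, and it shows only the “merge upward” path for small bins (the time‑window check and the “<10 (with noise)” label are omitted).

```typescript
// Simplified sketch of the publication gate. Assumptions: site-wide totals only,
// illustrative constant and function names, and only the merge-upward path for small bins.
const MIN_TOTAL_SITEWIDE = 500;
const MIN_PER_BIN = 10;

type Distribution = Record<string, number>;

function publishDistribution(counts: Distribution):
  | { status: "insufficient data" }
  | { status: "ok"; bins: Distribution } {
  const total = Object.values(counts).reduce((sum, n) => sum + n, 0);
  if (total < MIN_TOTAL_SITEWIDE) return { status: "insufficient data" };

  const bins: Distribution = { ...counts };
  // If either lower bin falls below the per-bin floor, merge them into a "≤69" bin.
  if ((bins["60-69"] ?? 0) < MIN_PER_BIN || (bins["0-59"] ?? 0) < MIN_PER_BIN) {
    bins["≤69"] = (bins["60-69"] ?? 0) + (bins["0-59"] ?? 0);
    delete bins["60-69"];
    delete bins["0-59"];
  }
  return { status: "ok", bins };
}
```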
Optional differential privacy (DP) notes
If we add DP noise, we will use a simple Laplace or Geometric mechanism per bin with a small epsilon (privacy budget) appropriate for casual, non‑sensitive stats. We will disclose:
- the fact that DP noise is enabled,
- the epsilon (e.g., 1–2), and
- the expected magnitude (±1–2 counts at typical scales).
DP is not mandatory for our use case, but it is a useful safeguard when publishing small or fresh cohorts.
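As a rough sketch of what per‑bin Laplace noise could look like (illustrative names; Math.random() stands in for a proper noise source and is used here only for brevity):

```typescript
// Rough sketch of per-bin Laplace noise. Assumptions: counting queries have
// sensitivity 1, epsilon is around 1-2, and Math.random() stands in for a proper RNG.
function laplaceNoise(scale: number): number {
  const u = Math.random() - 0.5;                                // uniform on (-0.5, 0.5)
  return -scale * Math.sign(u) * Math.log(1 - 2 * Math.abs(u)); // inverse-CDF sample
}

function noisyBins(counts: Record<string, number>, epsilon = 1): Record<string, number> {
  const scale = 1 / epsilon;                                    // scale b = sensitivity / epsilon
  const out: Record<string, number> = {};
  for (const [bin, n] of Object.entries(counts)) {
    out[bin] = Math.max(0, Math.round(n + laplaceNoise(scale))); // clamp to non-negative integers
  }
  return out;
}
```

With epsilon between 1 and 2, the noise scale is at most 1, which matches the ±1–2 count magnitude disclosed above.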
Bot and duplicate mitigation
Because the quiz is anonymous by design, we adopt coarse defenses to avoid skew:
- Rate‑limits: throttle rapid submissions from the same network or device.
- One‑bin per device per room: in room‑level charts, each device can submit at most one bin per room.
- Heuristics: exclude improbable patterns (e.g., hundreds of submissions within seconds).
- No fingerprints: we avoid invasive identification; we prefer conservative thresholds and transparency.
These mitigations keep privacy intact while reducing obvious noise.
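Purely as an illustration of the “one bin per device per room” rule and a coarse throttle, a client‑side sketch might look like this; the localStorage keys and interval are assumptions, and no fingerprinting is involved.

```typescript
// Illustrative client-side sketch of "one bin per device per room" plus a coarse
// throttle. Assumptions: browser localStorage, no fingerprinting, made-up key names.
const MIN_INTERVAL_MS = 10_000; // assumed minimum gap between submissions

function canSubmit(roomId: string): boolean {
  if (localStorage.getItem(`submitted:${roomId}`) !== null) return false; // one bin per room
  const last = Number(localStorage.getItem("lastSubmitAt") ?? 0);
  return Date.now() - last >= MIN_INTERVAL_MS;                            // coarse rate limit
}

function markSubmitted(roomId: string): void {
  localStorage.setItem(`submitted:${roomId}`, "1");
  localStorage.setItem("lastSubmitAt", String(Date.now()));
}
```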
Example (placeholder) visualization
Until we meet thresholds, you may see a placeholder chart with example proportions based on historical anecdotes and public conversations (not our data). It will be clearly labeled as a mock example. Once we have sufficient anonymous participation, the chart will automatically switch to real, aggregated bins.
Interpreting distributions responsibly
- No universal “average.” Distributions vary by audience, time, and wording.
- Ranges over rankings. A broad distribution is more informative than a single mean.
- Context matters. SFW default, campus cycles, and cultural shifts all affect results.
- Avoid stigma. Numbers describe participation; they don’t assign value to people.
We present ranges to help readers understand typical outcomes without framing any score as better or worse.
Transparency checklist (what you can audit)
- Public bins and public thresholds (listed on this page).
- Clear opt‑in for any site‑wide counting (off by default).
- No per‑question storage—ever.
- No usernames, emails, or device IDs.
- Deletion policy for room‑level data (automatic expiry).
- DP disclosure if noise is applied.
If we miss or change any of the above, we will update the page with a changelog.
Future API sketch (for rooms)
When we add rooms, the API will work with range bins only:
- POST /api/room → { roomId, expireAt } (create a room)
- POST /api/room/:id/submit → Body { range: "90-100" | "80-89" | "70-79" | "60-69" | "0-59" }
- GET /api/room/:id/stats → { "90-100": 3, "80-89": 5, "70-79": 2, "60-69": 1, "0-59": 0 }
- Expiry: 24–48 hours; after expiry, counts are deleted.
No nicknames, no free‑text, no individual scores.
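A hypothetical client call sequence against that sketch (endpoint and response shapes as listed above; nothing beyond a range bin is sent):

```typescript
// Hypothetical client usage of the rooms API sketched above. The endpoints and
// response shapes follow the sketch; only a range bin is ever sent.
type Bin = "90-100" | "80-89" | "70-79" | "60-69" | "0-59";

async function shareRangeWithRoom(bin: Bin): Promise<void> {
  const { roomId } = await fetch("/api/room", { method: "POST" }).then((r) => r.json());

  await fetch(`/api/room/${roomId}/submit`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ range: bin }),           // bin only; no nickname, no free text
  });

  const stats = await fetch(`/api/room/${roomId}/stats`).then((r) => r.json());
  console.log(stats);                               // e.g. { "90-100": 3, "80-89": 5, ... }
}
```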
Frequently asked questions
Q: Why don’t you show the exact mean or median?
A: A single number suggests more precision than our privacy model supports. Ranges communicate patterns while protecting individuals.
Q: Can my class or club get a custom report?
A: Use a room. It shows a simple range chart for your group, expires automatically, and never stores identities.
Q: Will you ever collect my answers?
A: No. Our design avoids collecting per‑question answers. If we offer opt‑in site‑wide counts, they will be range bins only.
Q: How do you prevent spam?
A: Basic rate‑limits and per‑device room rules, without fingerprinting. When in doubt, we err on the side of not publishing small or suspicious cohorts.
Markup and schema (when live)
Once real data is available, we may add a lightweight Dataset schema describing the distribution (time window, bin definitions, and thresholds). We will not publish raw records.
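A possible shape for that markup, expressed here as a TypeScript object for illustration; every value is a placeholder, and no real measurements are included.

```typescript
// Illustrative shape of a schema.org Dataset description for the published bins.
// All values are placeholders; no real measurements are included.
const distributionDataset = {
  "@context": "https://schema.org",
  "@type": "Dataset",
  name: "Rice Purity Score range distribution (rolling 30-day window)",
  description:
    "Anonymous, aggregate-only counts per public score bin, published only above minimum sample thresholds.",
  variableMeasured: ["90-100", "80-89", "70-79", "60-69", "0-59"],
  temporalCoverage: "last 30 days", // placeholder time window
};
```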
Final note
We want readers to learn something useful about typical outcomes without anyone sacrificing privacy. That’s why we publish bins, not answers; windows, not traces; and thresholds, not hype. If and when our anonymous participation is sufficient, you’ll see the real distribution here—until then, this methodology remains our guide.