In terms of RR, the present matters much more than the past. A user should be able to quickly increase their RR by playing a few games well. If they CD, though, their RR should dramatically drop. A weighting function with exponential decay should handle this well.
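To make the exponential decay idea concrete, here is a rough sketch, not a finished formula. It assumes each game is recorded only as "finished" or "CD", and that a game's weight is multiplied by a hypothetical `decay` factor for every game played after it. The function name, the 0.5 default, and the 0-to-1 scale are all my own assumptions for illustration.

```python
def reliability(outcomes, decay=0.5):
    """Exponentially weighted share of games the player finished.

    outcomes: list of bools, oldest game first (True = finished, False = CD).
    decay:    hypothetical tuning factor in (0, 1); smaller values forget faster.
    Returns a score in [0, 1], or None if the player has no games yet.
    """
    if not outcomes:
        return None
    weighted_sum = 0.0
    total_weight = 0.0
    # age 0 is the most recent game, so it gets the largest weight (decay ** 0 == 1)
    for age, finished in enumerate(reversed(outcomes)):
        weight = decay ** age
        if finished:
            weighted_sum += weight
        total_weight += weight
    return weighted_sum / total_weight


# Ten clean games followed by a CD: the score falls to just under 0.5.
after_cd = reliability([True] * 10 + [False])

# One CD followed by three clean games: the score is back above 0.9.
recovered = reliability([False, True, True, True])
```

With a decay of 0.5 the latest game counts for about half the score, which is probably too aggressive in practice. The point is only that a single knob controls how quickly a CD is punished and forgiven.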
In terms of what the user sees, a percentage is not good. What does 90% mean? It certainly doesn't mean they will CD in 10% of their games. All it means is that they are, to some unknown degree, more reliable than anyone with a lower RR. A percentage that doesn't actually measure a percentage is odd.
I would propose one of the following:
1) A simple labeling system (e.g. Very Unreliable, Unreliable, Reliable, Very Reliable) that corresponds to whatever back-end algorithm you choose. This has the advantage of being very clear in meaning. There is the potential downside of locking players in low RR hell (seriously, will anyone play with someone labeled Very Unreliable?), but if we have a decent exponentially decaying algorithm, they should only be trapped for 1 or 2 games, and they will quickly learn not to CD again.
2) Using percentiles instead of percentages. No one really cares how reliable someone is in absolute terms; they care how reliable that player is *compared to other players*. Percentiles are a natural solution to this. Someone is in the 80th percentile? OK, that's probably pretty good. 20th percentile? Better watch out. The only issue with this (besides people needing to understand percentiles) is that if everyone is super reliable (or unreliable), the percentiles might get a bit out of whack.
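Both display options can sit on top of the same back-end score. The sketch below assumes the back end produces a number between 0 and 1 per player. The label cutoffs (0.9, 0.7, 0.4) are made-up placeholders, and "percentile" is taken here to mean the share of players with a strictly lower score, which is only one of several possible definitions.

```python
from bisect import bisect_left


def label(score):
    """Map a back-end score in [0, 1] to a label. Cutoffs are placeholders."""
    thresholds = [(0.9, "Very Reliable"), (0.7, "Reliable"), (0.4, "Unreliable")]
    for cutoff, name in thresholds:
        if score >= cutoff:
            return name
    return "Very Unreliable"


def percentile_rank(score, all_scores):
    """Percentage of players whose score is strictly lower than `score`."""
    ranked = sorted(all_scores)
    # bisect_left returns how many entries in `ranked` are strictly below `score`
    return 100.0 * bisect_left(ranked, score) / len(ranked)


# A spread-out player base: a 0.5 score beats half of the players.
spread = [0.1, 0.2, 0.5, 0.9]
middle = percentile_rank(0.5, spread)

# Almost everyone is highly reliable: 0.98 still lands at the very bottom.
clustered = [0.98, 0.99, 0.99, 0.99]
bottom = percentile_rank(0.98, clustered)
```

The second example is the "out of whack" case: a player with a 0.98 score would be labeled Very Reliable under option 1 but would show up in the 0th percentile under option 2.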