Assessment
The goal of Proficiency-Based Learning Simplified is to ensure that students acquire the most essential knowledge and skills they will need to succeed in school, higher education, the modern workplace, and adult life. Therefore, systems for assessing and verifying proficiency should prioritize enduring knowledge and skills—i.e., graduation standards and related performance indicators.

In this section, district and school leaders will find guidance on what to assess, how to assess it, and how to verify and report student proficiency in relation to standards.

Verifying Proficiency: Graduation Standards

Verifying achievement of graduation standards—the learning expectations students must achieve to be eligible for grade promotion or a diploma—should be based on a student’s achievement of performance indicators over time. The achievement of graduation standards requires students to develop a strong knowledge base and sophisticated conceptual understanding. Performance indicators describe, in more fine-grained detail, the specific knowledge and skills that students must acquire to demonstrate they have met a graduation standard—in effect, performance indicators break down comprehensive graduation standards into their component parts.

The following examples, taken from our exemplar graduation standards for English language arts and mathematics, will help to illustrate the relationship between graduation standards and performance indicators:

Sample Graduation Standard: English Language Arts
Conduct research projects based on focused questions, demonstrating understanding of the subject.

Performance Indicators

  • Collect relevant information from multiple print and digital sources.
  • Integrate accurate information into the text selectively and purposefully to maintain the flow of ideas.
  • Follow a standard citation format, avoiding plagiarism and overreliance on any one source.
  • Draw evidence from literary or informational texts to support analysis, reflection, and research.

Sample Graduation Standard: Mathematics
Reason and model quantitatively, using units and number systems to solve problems.

Performance Indicators

  • Extend the properties of exponents to rational exponents.
  • Use the properties of rational and irrational numbers.
  • Reason quantitatively and use units to solve problems.
  • Perform arithmetic operations with complex numbers.
  • Use complex numbers in polynomial identities and equations.

This document describes two primary ways that schools and educators can verify a student’s achievement of graduation standards.

Verification Methods

Using aggregate scores on performance indicators, districts and schools can verify the achievement of graduation standards in two primary ways: Body-of-Evidence Verification or Mathematical Verification.

  1. Body-of-Evidence Verification: Determining proficiency using a body of evidence requires a review and evaluation of student work and assessment scores. The review and evaluation process may vary in both format and intensity, but verifying proficiency requires that educators use common criteria to evaluate student performance consistently from work sample to work sample or assessment to assessment. For example, teachers working independently may use agreed-upon criteria to evaluate student work, a team of educators may review a student portfolio using a common rubric, or a student may demonstrate proficiency through an exhibition of learning that is evaluated by a review committee using the same consistently applied criteria.
  2. Mathematical Verification: Determining proficiency using mathematical verification requires teachers to use a common formula that aggregates assessment results on performance indicators over time to determine the achievement of a graduation standard.

Body-of-Evidence Verification

Pros:

  • Encourages students and educators to reflect on and assess learning progress and work quality.
  • Emphasizes the evaluation of a body of work that has been collected over time.
  • Encourages students to take greater ownership over the learning process.
  • Allows for evidence from outside-of-school learning pathways, such as internships or dual-enrollment courses.
  • Can be used to involve parents and community members in the learning process, such as through a public exhibition of learning.

Cons:

  • Can be a time-consuming process for both students and teachers.
  • May be perceived as a disconnected, after-the-fact event rather than an integral part of the learning and assessment process.
  • May require schools to communicate student achievement differently than they have in the past, which may be unfamiliar or confusing to some parents and families.
  • Requires teachers, reviewers, and scorers to use common evaluation criteria and processes, which can require training and practice to calibrate.

Mathematical Verification

Pros:

  • Results are relatively straightforward and easy to calculate.
  • Utilizes scores on student work that has already been assessed.
  • Communication and understanding of student progress can be done in more traditional and familiar ways.
  • Existing student-information systems often use mathematical calculations to report student learning.

Cons:

  • Learning progress can be obscured when calculating a series of scores rather than evaluating learning growth over time.
  • May allow for less student voice and choice than a body-of-evidence approach.
  • May inadvertently limit flexibility and creativity when it comes to instruction and assessment.
  • May encourage students to narrowly focus on grades and numerical indicators of success, rather than on their learning progress and skill development.

Body-of-Evidence Verification

Determining proficiency using a body-of-evidence process requires students to gather work samples and other evidence of academic accomplishment, present the evidence to educators, and have it scored against a set of common criteria defined in a rubric or scoring guide.

There are two primary approaches to body-of-evidence verification that schools typically use:

  • Portfolios: Students collect work samples and other evidence of learning from courses and academic experiences that teachers or review committees assess using common criteria at the end of a defined instructional period, such as a term or school year.
  • Exhibitions: Students work toward a culminating demonstration of learning that teachers or review committees assess using common criteria at the end of a defined instructional period, such as a term or school year.

Portfolios and exhibitions typically address a wide range of content-area and cross-curricular standards, including critical thinking and problem solving, reading and writing proficiency, and habits of work and character traits (e.g., teamwork, preparedness, responsibility, or persistence). In course-based portfolio and exhibition assessments, individual teachers use common, agreed-upon criteria to evaluate a body of work that students have completed over the course of an instructional period. For cross-curricular portfolios and exhibitions, groups of content-area teachers or review committees evaluate the work. It should be noted that portfolios do not require students to create new work, but to collect and present past work, evidence, and accomplishments—although exhibitions can incorporate examples of past work as well.

In many schools, end-of-term portfolios and exhibitions are also used as a way to introduce greater creativity and flexibility into the assessment process. For example, students may incorporate work samples and evidence from outside-of-school learning experiences, such as internships, dual-enrollment courses, vacation-break programs, or self-directed projects. The approach may also allow for greater instructional flexibility because teachers will be less focused on generating a certain number of scores, using certain types of assessments, over the course of an instructional period.

To use these methods effectively, schools need to invest time and resources in their body-of-evidence assessment system. For example, teachers are often trained in portfolio evaluation and consistent scoring; students are given time and support to create their portfolios; students and their parents are informed about the criteria and how the evidence will be evaluated; and the schools give teachers and review committees time during the regular school day to evaluate the portfolios.

Mathematical Verification

Mathematical verification can be computed in three primary ways:

  • Formula: Performance-indicator scores are calculated using a common mathematical formula, such as an average, to determine a student’s proficiency level on each graduation standard.
  • Majority: Students are required to demonstrate achievement of a majority of performance indicators to meet a graduation standard.
  • Totality: Students are required to demonstrate achievement of all performance indicators to meet a graduation standard.

The following table illustrates how the three mathematical approaches may be used to determine whether a student has met a graduation standard. In this example, we use a 4.0 scale in which a score of 3.0 meets the standard:

Mathematics Graduation Standard: Number and Quantity
Reason and model quantitatively, using units and number systems to solve problems

Performance-indicator scores:

  • Extend the properties of exponents to rational exponents: 3.5
  • Use the properties of rational and irrational numbers: 3.0
  • Reason quantitatively and use units to solve problems: 3.5
  • Perform arithmetic operations with complex numbers: 3.0
  • Use complex numbers in polynomial identities and equations: 2.0

Meets graduation standard?

  • Average: YES. The average (3.0) meets the proficiency benchmark.
  • Majority: YES. Four of five performance indicators were achieved.
  • Totality: NO. Not all performance indicators were achieved.
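For readers who want to see the mechanics, the following minimal Python sketch shows how each of the three verification rules could be computed. The function names are illustrative assumptions, and the 3.0 benchmark on a 4.0 scale simply mirrors the example above rather than any particular grading system.

```python
# Illustrative sketch of the three mathematical verification approaches.
# Assumes a 4.0 scale where 3.0 meets the standard; names are hypothetical,
# not drawn from any particular grading platform.

BENCHMARK = 3.0

def verify_average(scores):
    """Formula approach: the mean of the indicator scores must meet the benchmark."""
    return sum(scores) / len(scores) >= BENCHMARK

def verify_majority(scores):
    """Majority approach: more than half of the indicators must be met."""
    met = sum(1 for s in scores if s >= BENCHMARK)
    return met > len(scores) / 2

def verify_totality(scores):
    """Totality approach: every indicator must be met."""
    return all(s >= BENCHMARK for s in scores)

# Performance-indicator scores from the example above.
scores = [3.5, 3.0, 3.5, 3.0, 2.0]
print(verify_average(scores))   # True:  the mean is exactly 3.0
print(verify_majority(scores))  # True:  4 of 5 indicators meet the benchmark
print(verify_totality(scores))  # False: one indicator falls below 3.0
```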

→ Download Verifying Proficiency: Graduation Standards (.pdf)

Verifying Proficiency: Performance Indicators

“We know that students will rarely perform at high levels on challenging learning tasks at their first attempt. Deep understanding or high levels of proficiency are achieved only as a result of trial, practice, adjustments based on feedback, and more practice.”
—Jay McTighe, “What Happens Between Assessments?,” Educational Leadership

In a proficiency-based system, assessment, grading, and reporting practices are designed to (1) accurately measure and describe the knowledge and skills students have acquired, and (2) emphasize and encourage learning growth over time. For these reasons, the achievement of specific learning standards is tracked and reported by teachers, which requires that grade books be reformatted to report assessment results by standard, rather than report results by test or assignment. In Proficiency-Based Learning Simplified, we call the standards for a course or learning experience performance indicators to distinguish them from graduation standards—the standards students must demonstrate to be eligible for grade promotion or a diploma.

Achievement of performance indicators is verified based on a student’s performance on assessments over time. The achievement of performance indicators requires students to demonstrate that they have acquired the knowledge and skills—i.e., the learning objectives or learning targets—addressed in units and lessons, which are reported in course-based assessment scores.

When designing assessments, teachers begin with the performance indicators that a specific assessment is intended to address (a process generally known as “backward design”). If an assessment is intended to measure four performance indicators, for example, teachers create four entries in their grade books—one entry for each performance indicator—and scores on the assessment are reported for each indicator.
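To illustrate this grade-book structure, here is one hypothetical way an assessment’s results might be recorded per performance indicator. The student name, assessment title, and scores below are invented for illustration, and the indicator labels paraphrase the ELA examples earlier in this section.

```python
# Hypothetical grade-book record for a single assessment that measures
# four performance indicators: one score is recorded per indicator,
# rather than a single grade for the whole assessment.
gradebook_entry = {
    "student": "Jordan",                      # invented name
    "assessment": "Unit 2 research project",  # invented title
    "indicator_scores": {
        "Collect relevant information from multiple sources": 3.0,
        "Integrate information selectively and purposefully": 2.5,
        "Follow a standard citation format": 3.5,
        "Draw evidence from texts to support analysis": 3.0,
    },
}
```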

To determine the extent to which students have demonstrated achievement of performance indicators, the Great Schools Partnership recommends that scores be calculated in a way that assigns the greatest weight to the most recently assessed student work. In this way, students are not penalized for poor performance earlier in a term when more recent assessments indicate they have met or exceeded expectations.

The three most widely used grading options that assign greater weight to more recent assessment results are Power Law, Decaying Average, and Most Recent Score.

*NOTE: While the Great Schools Partnership recommends the use of the following three methods, both power law and decaying average may require districts and schools to heavily modify existing grading systems or invest in specialized online systems—both of which could have financial implications. While the Great Schools Partnership does not endorse any specific grading platform or product, we have created a guide to selecting online grading and reporting systems that will be useful to districts and schools.

Power Law

Description: The power-law formula plots different assessment scores over time and attempts to draw a “best-fit” line that effectively answers the question: What score would the student most likely receive on the performance indicator if she were assessed again?

Pros: Power law does not penalize students for poor performance at the beginning of a grading period, and it produces scores that more accurately reflect what students know and can do at the end of a semester or year.

Cons: Because the formula generates a predictive trend, it’s possible that power law could produce, in some cases, a final score that is higher than the highest score earned by a student.

Decaying Average

Description: Decaying-average formulas assign progressively decreasing weight to older assessment scores. In effect, newer assessments “count more” in the final score.

Pros: Because skills and knowledge increase over time, giving more weight to more recent assessments can facilitate the learning process and encourage teaching practices that are focused on learning growth.

Cons: Decaying averages introduce the possibility that students may not try as hard on some assessments given earlier in a grading period.

Most Recent Score

Description: Teachers use the most recent assessment score (or scores) to determine if students have achieved performance indicators.

Pros: Using the most recent assessment score encourages students to improve their performance because new assessment results replace older results, and final grades will more accurately reflect the knowledge and skills they acquired over the course of a term.

Cons: Some teachers are uncomfortable using systems that replace older scores because they believe that students may not give every assessment their best effort if they know that some grades won’t “count” or that they will be allowed to redo or retake assessments.

How Power Law Works

While the power-law formula is mathematically complex and requires specialized grading systems, educators only need to know how it works and how to interpret scores (for a detailed explanation of the formula, see Transforming Classroom Grading by Robert J. Marzano).

Power law predicts what the student’s next score will be based on the scores a student has already earned. In effect, power law answers the question: What score would this student most likely receive on the standard if she were assessed again?

In the table below, for example, the teacher is using a four-point rubric to evaluate proficiency on four distinct assessments. Four students in the class earned the same set of scores (1.00, 2.00, 3.00, and 4.00), but each in a different order. If the scores were averaged, all four students would receive a 2.5, but the power-law formula produces different aggregate scores because it generates a trend that places more weight on more recent assessments.

The following chart* provides a simplified illustration of how power law works in practice:

  • Student 1 (scores: 1.00, 2.00, 3.00, 4.00; final score: 4.00): The scores show continuous improvement, and the student will likely demonstrate mastery on the next assessment.
  • Student 2 (scores: 1.00, 3.00, 2.00, 4.00; final score: 3.66): The scores show irregular improvement, and the student will likely demonstrate high but not complete mastery on the next assessment.
  • Student 3 (scores: 2.00, 4.00, 1.00, 3.00; final score: 2.16): The scores show very uneven performance, and the student will likely demonstrate a mid-level of achievement on the next assessment.
  • Student 4 (scores: 4.00, 3.00, 2.00, 1.00; final score: 1.23): The scores show continuous decline, and the student will likely demonstrate a low level of achievement on the next assessment.

*Note: This section was adapted from useful explanations created by EasyGradePro and JumpRope.
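For the technically curious, the sketch below shows one common way a power-law trend can be computed: fit a curve of the form y = a · x^b to the (assessment number, score) pairs by linear regression on logarithms, then read the trend’s value at the final assessment. Treat this as an approximation for illustration only (Marzano’s published formula differs in its details); this version reproduces the first three final scores in the chart above and comes close, about 1.28 rather than 1.23, on the fourth.

```python
import math

def power_law_score(scores, scale_max=4.0):
    """Fit a power-law trend y = a * x**b to (assessment number, score)
    pairs using linear regression on logarithms, then return the trend's
    value at the final assessment, capped at the top of the scale.
    Assumes at least two scores, all greater than zero."""
    n = len(scores)
    xs = [math.log(i) for i in range(1, n + 1)]
    ys = [math.log(s) for s in scores]
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    a = math.exp(mean_y - b * mean_x)
    return min(a * n ** b, scale_max)

print(round(power_law_score([1, 2, 3, 4]), 2))  # 4.0  (Student 1)
print(round(power_law_score([1, 3, 2, 4]), 2))  # 3.66 (Student 2)
print(round(power_law_score([2, 4, 1, 3]), 2))  # 2.16 (Student 3)
print(round(power_law_score([4, 3, 2, 1]), 2))  # 1.28 (Student 4; the chart shows 1.23)
```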

How Decaying Average Works

A decaying-average formula gives more weight to more recent assessment scores. Decaying average is based on the assumption that students—with more instruction, support, and practice—will progressively increase their knowledge, comprehension, and skill, while also decreasing the frequency of errors and incorrect answers. The formula is intended to produce scores that more accurately reflect learning progress on performance indicators—i.e., where students end up, rather than where they started out.

One of the benefits of decaying average is that it can be used with as few as two assessment scores. And unlike power law, which uses a complex mathematical algorithm, decaying average is relatively easy to explain to students and parents. Districts and schools can determine the weight used in the formula, but the most recent assessment needs to carry at least a 60-percent weight to produce reliable scores.

If a teacher is using decaying average with a .65 weight, for example, and a student takes two assessments and earns scores of 2.00 and 3.00 for a performance indicator [.35(2) + .65(3)], the final score would be a 2.65 (or below proficiency). If the student then takes a third assessment and earns a score of 4.00 [.35(2.65) + .65(4)], the recalculated score would be a 3.53 (or above proficiency). Notice how the formula takes the last recorded proficiency level (not the last recorded assessment score) and weights it by .35 to produce the “decaying” average.
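The running calculation is easy to express in code. The following minimal sketch implements the decaying average described above, assuming a 0.65 weight; it reproduces the worked example’s 2.65 and 3.53.

```python
def decaying_average(scores, weight=0.65):
    """Running decaying average: each new score counts for `weight`,
    and the previous running value counts for (1 - weight)."""
    running = scores[0]
    for score in scores[1:]:
        running = (1 - weight) * running + weight * score
    return running

print(round(decaying_average([2.00, 3.00]), 2))        # 2.65
print(round(decaying_average([2.00, 3.00, 4.00]), 2))  # 3.53
```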

*NOTE: There are a variety of ways to calculate decaying average, and online grading systems may offer multiple options. For example, some may offer multiple weight options or allow teachers to assign more weight to certain assessments or types of assessments. For this reason, districts and schools should always review all available options and ask questions to determine whether a specific product or platform will suit a school’s instructional needs and goals.

The following chart* provides a simplified illustration of how decaying average works in practice:

In each row, the values in parentheses show the running decaying average after the second and third assessments.

  • Student 1 (scores: 1.00, 2.00 (1.65), 3.00 (2.53), 4.00; final score: 3.48): The scores show continuous improvement, and the student’s proficiency level reflects learning progress made during the grading period. The final score indicates the student’s current proficiency level, while also factoring in the student’s less successful demonstrations at a diminished weight.
  • Student 2 (scores: 1.00, 3.00 (2.30), 2.00 (2.10), 4.00; final score: 3.33): The scores show irregular improvement, which suggests that the student may not have understood an important concept or that outside factors may have adversely affected the student’s performance. If a low score is misrepresentative, the student’s proficiency level will quickly go up after scores improve on additional assessments.
  • Student 3 (scores: 2.00, 4.00 (3.30), 1.00 (1.80), 3.00; final score: 2.58): The scores show very uneven performance. While the student demonstrated proficiency on the last assessment, the current score recognizes that the student has not met the standard with enough consistency to be considered proficient at this time.
  • Student 4 (scores: 4.00, 3.00 (3.35), 2.00 (2.47), 1.00; final score: 1.51): The scores show continuous decline. If the student’s scores were averaged, the final score of 2.5 would reflect an inflated proficiency level, given the student’s most recent assessment results. The decaying average more accurately represents the student’s declining assessment results.

*Note: This section was adapted from useful explanations created by EasyGradePro and JumpRope.

How Most Recent Score Works

In some schools, teachers use scores on the most recent assessment (or assessments) to determine proficiency on performance indicators. The method is based on the assumption that a student’s most recent performance is representative of the knowledge and skills he or she has acquired.

When deciding whether to use the most-recent-score method, school leaders and teachers should consider the structure of the curriculum to ensure that the approach will accurately reflect student learning progress and achievement. For example, the method tends to work best with skill-based standards that require students to refine and improve their abilities over time. With some content-based standards that are demonstrated at a specific point in time and only once, the method may produce less accurate results.

When most recent score is used to determine proficiency, students can quickly recover from poor assessment scores that failed to meet expected standards, while students who met standards initially must also maintain their high performance. That said, the method could produce less representative or accurate proficiency levels when scores are uneven.

The following chart provides a simplified illustration of how most recent score works in practice:

  • Student 1 (scores: 1.00, 2.00, 3.00, 4.00; final score: 4.00): The scores show continuous improvement, and the student’s proficiency level reflects that progress.
  • Student 2 (scores: 1.00, 3.00, 2.00, 4.00; final score: 4.00): The scores show irregular improvement, and the final score may or may not reflect the most accurate proficiency level in some cases.
  • Student 3 (scores: 2.00, 4.00, 1.00, 3.00; final score: 3.00): The scores show very uneven performance. While the final score meets the standard, the student’s proficiency level may not be entirely clear in some cases.
  • Student 4 (scores: 4.00, 3.00, 2.00, 1.00; final score: 1.00): The scores show continuous decline, and the final score may or may not reflect the most accurate proficiency level.

Alternative Methods

Some schools choose to use alternative methods and formulas in their proficiency-based systems, including Mean, Mode, and Highest Score.

*NOTE: These three options are described here for informational purposes only—the Great Schools Partnership does not recommend the use of these methods.

Mean

Description: All assessment scores are averaged together to determine proficiency.

Pros: This method will be familiar to teachers, students, and parents because it has historically been the most common grading method used in schools.

Cons: Averaging can distort and misrepresent proficiency, particularly when students make significant progress over the course of a grading term.

Mode

Description: The most common score is used to determine proficiency.

Pros: Mode is relatively easy to explain to parents and students.

Cons: If the grading scale used by schools has a lot of gradations, the mode is much more difficult to calculate and may not accurately reflect a student’s proficiency level.

Highest Score

Description: The highest score achieved by a student is used to determine proficiency.

Pros: This method could encourage students to take risks in their education and explore more challenging learning opportunities after they have demonstrated proficiency.

Cons: The highest score may not accurately reflect a student’s level of knowledge and skill, especially when performance is inconsistent.

How Mean Works

Most traditional assessment systems are based on the average (or mean) of all grades a student earns—scores are added up and divided by the total number of scores. In some schools, teachers may assign more weight to certain assessments or types of assessments (such as homework scores vs. test scores), or they may decide that a greater percentage of a student’s final course grade will be based on certain types of assessments (for example, the score on a final project may count for 25 percent of a student’s final grade).

While averaging successful assessment scores provides a more representative picture of the knowledge and skills students have acquired, averaging all scores can distort and misrepresent student proficiency and learning progress. For this reason, some schools choose to delay the numerical grading of assessments—by using placeholders such as “not met” or “insufficient evidence”—when averaging. In these cases, teachers will provide additional opportunities for students to redo assessments or improve the quality of their work.

In general, the Great Schools Partnership does not recommend the use of averaging to determine the achievement of performance indicators for three primary reasons:

  1. Averaging may not accurately reflect academic effort, learning growth, or end-of-term proficiency. When scores are averaged at the end of a reporting period, the results may penalize students for poor assessment scores at the beginning of a term—even if they worked hard, improved their performance, and ultimately demonstrated proficiency. Even when averages are weighted to distinguish between formative and summative assessments or “major” and “minor” assessments, the results may still provide a distorted representation of achievement and proficiency.
  2. Averaging may introduce a disincentive to improve. If students fail a few assessments at the beginning of a term, these early failures will impose clear mathematical limits on the final grade they can earn. Consequently, students may be less motivated to work hard or overcome past failures because their final grades won’t reflect their effort and learning progress.
  3. Grade averaging advantages students who begin a course prepared and disadvantages those who begin unprepared. Because effort and learning progress may not be accurately represented in averaged grades, students who begin school with more education, skills, resources, or family support have a strong advantage—in terms of their likelihood of earning a good grade—over students who arrive less prepared. And because academic readiness tends to mirror demographic factors such as socioeconomic and minority status, grade averaging also raises concerns about educational equity.

The following chart provides a simplified illustration of how averaging works in practice:

  • Student 1 (scores: 1.00, 2.00, 3.00, 4.00; final score: 2.50): The scores show continuous improvement, but the final score does not reflect the significant learning progress made by the student—instead, it suggests that the student has failed to meet proficiency.
  • Student 2 (scores: 1.00, 3.00, 2.00, 4.00; final score: 2.50): The scores show irregular improvement, but the final score does not meet proficiency.
  • Student 3 (scores: 2.00, 4.00, 1.00, 3.00; final score: 2.50): The scores show very uneven performance. While the average is somewhat representative of this student’s proficiency level, the averaged scores for the other students are clearly misrepresentative.
  • Student 4 (scores: 4.00, 3.00, 2.00, 1.00; final score: 2.50): Even though the scores show continuous decline, the student receives the same final score as Student 1, who made clear and significant improvement.

How Mode Works

The mode is the most common result in a given data set. While the mode is relatively straightforward and easy to explain, many people confuse “mode” with other mathematical terms like mean (average) and median (middle value).

For most performance indicators, teachers will have more than one assessment result to consider, and the most common score achieved by students may be used to determine proficiency in some schools. Yet when teachers have a limited data set (i.e., fewer scores), when they are using grading scales with more gradations (such as 1–100 scales), or when scores are widely discrepant, the mode may produce misrepresentative results. For example, a student who scored a 1.00 on the first three assessments, a 3.00 on the next two assessments, and a 4.00 on the final two assessments would receive a final score of 1.00 even though the majority of the assessment results demonstrated proficiency. In this case, the student’s learning growth over the grading term would also not be reflected in the final score.
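As a quick illustration, here is a minimal sketch of mode scoring that returns “IE” (insufficient evidence) when no single score occurs most often, matching the convention in the chart below; the tie-handling rule is an assumption for illustration.

```python
from collections import Counter

def mode_score(scores):
    """Return the most common score, or "IE" (insufficient evidence)
    when no single score occurs more often than every other score."""
    counts = Counter(scores).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return "IE"  # tie for most common: no unique mode
    return counts[0][0]

print(mode_score([1.00, 1.00, 1.00, 3.00, 3.00, 4.00, 4.00]))  # 1.0, despite later growth
print(mode_score([2.00, 4.00, 1.00, 3.00]))                    # IE: every score is unique
```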

The following chart provides a simplified illustration of how the mode works in practice:

  • Student 1 (scores: 1.00, 2.00, 2.00, 4.00; final score: 2.00): The most common score is 2.0.
  • Student 2 (scores: 4.00, 4.00, 4.00, 4.00; final score: 4.00): The most common score is 4.0.
  • Student 3 (scores: 2.00, 4.00, 1.00, 3.00; final score: IE): Because there is no “most common score” in this data set, more evidence is needed to verify proficiency (“IE” in this case stands for insufficient evidence).
  • Student 4 (scores: 1.00, 3.00, 4.00, 1.00; final score: 1.00): The most common score is 1.0, even though the student’s more recent scores were higher.

How Highest Score Works

The highest-score method is easily explained: the highest assessment score achieved during a grading period is the student’s final score for a performance indicator.

While highest score is easy to use, the method will produce misrepresentative results in many cases. For example, if a student scores a 4.00 on one assessment and 1.00 on all other assessments, the highest score (4.00) may not accurately reflect a student’s level of proficiency. In addition, the method does not take into account a student’s learning growth over the grading term. The advantage of highest score is that it recognizes a student’s best work, while the disadvantage is that it may not accurately represent uneven performance.

The following chart provides a simplified illustration of how highest score works in practice, while also revealing the clear disadvantage of the approach:

  • Student 1 (scores: 1.00, 2.00, 2.00, 4.00; final score: 4.00)
  • Student 2 (scores: 4.00, 4.00, 4.00, 4.00; final score: 4.00)
  • Student 3 (scores: 1.00, 4.00, 1.00, 1.00; final score: 4.00)
  • Student 4 (scores: 4.00, 3.00, 2.00, 1.00; final score: 4.00)

Because all four students achieved a score of 4.00 at some point during the grading period, all earned a final score of 4.00 even though the performance patterns from student to student are clearly dissimilar and representative of different levels of proficiency.

→ Download Verifying Proficiency: Performance Indicators (.pdf)

Verifying Proficiency: Scoring Criteria

“Clear learning goals help students learn better (Seidel, Rimmele, & Prenzel, 2005). When students understand exactly what they’re supposed to learn and what their work will look like when they learn it, they’re better able to monitor and adjust their work, select effective strategies, and connect current work to prior learning (Black, Harrison, Lee, Marshall, & Wiliam, 2004; Moss, Brookhart, & Long, 2011)…. The important point here is that students should have clear goals. If the teacher is the only one who understands where learning should be headed, students are flying blind. In all the studies we just cited, students were taught the learning goals and criteria for success, and that’s what made the difference.”

—Brookhart & Moss, “Learning targets on parade,” Educational Leadership, October 2014

Overview

In a proficiency-based system, teachers assess student learning progress and academic achievement using common scoring guides that include detailed descriptions—or “scoring criteria”—outlining what students need to know and be able to do as they work toward, meet, and exceed proficiency on a given learning standard. Scoring criteria help teachers consistently evaluate work products and other evidence of proficiency as students acquire the essential knowledge and skills required for grade promotion and graduation.

Scoring criteria describe, in clear and precise terms, the characteristics of each stage of achievement along a proficiency continuum—from not meeting to exceeding a specific learning standard. Once schools have articulated scoring criteria for each of the learning objectives students are expected to meet, teachers can then assemble rubrics for assessing student work using a selection of appropriate scoring criteria.

Why Scoring Criteria Matter

Scoring criteria improve assessment in several ways:

  • Through collaborative work during common planning time or in professional learning groups, teachers develop a common understanding of what specific learning evidence constitutes not meeting, meeting, and exceeding proficiency.
  • Scoring criteria enable educators to design a variety of assessments to meet unique student learning needs while applying rigorous academic standards—i.e., they balance the need to maintain high expectations for all students with the need for creative instructional approaches that allow students to demonstrate learning in a variety of ways.
  • Evaluations of student proficiency are more consistent across teachers, courses, and learning experiences, and school leaders, teachers, and parents can be more confident in the accuracy, precision, and reliability of assessment results.
  • When articulated clearly and descriptively, scoring criteria, and the resulting assessment scores, provide more detailed information about learning progress and achievement, which helps students understand the specific knowledge and skills they must demonstrate to reach or exceed proficiency, and helps teachers and parents identify specific learning needs, challenges, and strengths for each student.
  • Once scoring criteria are established, teachers can quickly and efficiently assemble reliable rubrics for any given assessment.

How Scoring Criteria Work

The Great Schools Partnership’s approach to proficiency-based learning articulates the following broad categories of learning outcomes for students:

  • Cross-curricular graduation standards that describe the essential cross-disciplinary skills and habits of work that students will need to succeed in adult life.
  • Content-area graduation standards that describe the essential knowledge and skills students need to acquire in each content area.
  • Performance indicators that describe what students need to know and be able to do to meet either cross-curricular or content-area graduation standards.

Because performance indicators are more specific and measurable than graduation standards—and therefore more “assessable” in the classroom—teachers build assessments by combining performance indicators, and associated scoring criteria, within and across content areas and learning experiences.

In short, performance indicators are the learning expectations for any given assessment, and scoring criteria are the descriptions that articulate the continuum of evidence teachers evaluate to determine the extent to which students have achieved those learning objectives.

Design Principles and Best Practices

PRINCIPLE 1
Scoring criteria articulate a continuum of increasingly complex cognitive demand.

Do This
Scoring criteria should articulate an intentional sequencing of increasingly sophisticated and demanding thinking skills that are aligned with the knowledge and skills described in a performance indicator. The language used to describe the cognitive demand at each proficiency level should correspond to an existing taxonomy, and use verbs that are precise and descriptive (e.g., compare, organize, solve, or justify).

Avoid This
Scoring criteria should not expect students to simply produce more work products, apply the same skill in different contexts, or articulate the same level of cognitive demand but apply it to different tasks. The language used to describe the cognitive demands at each proficiency level should not be randomly selected or overly general and vague (e.g., stating that students exceed the standard by going “above and beyond”).

Illustrative Example
Level: Elementary
Content Area: ELA

Performance Indicator: Write an opinion on topics or texts supporting a point of view with reasons and information.

  • Emerging: I can state my opinion in writing about a topic or text.
  • Developing: I can explain my opinion in writing about the topic or text.
  • Proficient: I can support my written opinion with evidence about a topic or text.
  • Distinguished: I can compare and contrast my written opinion with other opinions using evidence from a text.

Best Practices

  • An effective proficiency continuum must clearly delineate different levels of cognitive demand, which requires educators to use verbs that precisely articulate a progression of increasingly sophisticated thinking skills and academic abilities. To find the appropriate verbs, select a research-based taxonomy of thinking skills—the Great Schools Partnership recommends either Bloom’s Revised Taxonomy or Webb’s Depth of Knowledge—and consistently reference the terminology outlined in the taxonomy when creating scoring criteria for each level of the proficiency continuum.
  • Begin by writing the desired learning outcomes for the proficient level in precise, descriptive, student-friendly terms. Using the selected taxonomy, identify the primary thinking skills that align with the performance indicator and use the most relevant and applicable verbs for that specific learning objective. In the example above, a student’s ability to marshal evidence in support of a written opinion is the most essential skill described in the performance indicator—consequently, the proficient description emphasizes the demonstration of this foundational skill.
  • A simple way to begin articulating the different gradations of proficiency is to ask, “What should students be able to do when they are…emerging, developing, etc.?” In the example above, which is based on Bloom’s Revised Taxonomy, the developing level describes a specific cognitive and academic skill (explaining the rationale for an opinion) that is more sophisticated than the emerging description (merely stating an opinion without expressing a supporting rationale).

PRINCIPLE 2
Scoring criteria focus on the quality of student work at each level of performance.

Do This
Scoring criteria should use objective descriptions that articulate an increasingly nuanced progression of concrete evidence (i.e., what students specifically have to demonstrate at each level to show what they know and are able to do) that can be consistently measured and evaluated by different individuals across a variety of assessments and work products.

Avoid This
Scoring criteria should not focus on the frequency of student performance (e.g., “I can complete [a task] 3–5 times”) or use frequency descriptors such as never, rarely, or always. Educators should avoid overly subjective descriptions that either cannot be measured or evaluated consistently (e.g., poor, excellent, high quality) or that focus on superficial features (e.g., neat, colorful, visually appealing).

Illustrative Example
Level: High School
Content Area: Math

Performance Indicator: Use geometric shapes and their properties to model physical objects.

  • Emerging: I can identify geometric shapes (e.g., triangles, quadrilaterals, and other polygons).
  • Developing: I can describe geometric shapes and their basic properties.
  • Proficient: I can use geometric shapes to model physical objects.
  • Distinguished: I can evaluate the quality of models representing physical objects.

Best Practices

  • Scoring criteria should emphasize quality, not quantity or frequency. Rather than describing the number of times a student should demonstrate knowledge or skill acquisition (e.g., “I can sometimes use geometric shapes to model physical objects”), scoring criteria should reflect the required complexity and sophistication of a specific demonstration of knowledge or skill at a given performance level (e.g., “I can use geometric shapes to model physical objects”).
  • Scoring criteria for the proficient and exceeds levels should explicitly address all the core knowledge and skills articulated in the performance indicator. Likewise, scoring criteria for levels below proficient should not include all elements of the performance indicator, given that emerging- or developing-level work—by definition—does not satisfy the learning objectives articulated in a performance indicator.
  • The language used in scoring-criteria descriptions should articulate evidence that can be objectively measured, assessed, and evaluated in a variety of learning contexts (if multiple educators fail to produce reasonably consistent assessments of the same evidence or work products, the scoring criteria are likely too subjective). To improve consistency from assessment to assessment, teams of teachers can simultaneously score the same student work product and then discuss why, specifically, their results either converged or diverged—these opportunities for collaborative professional learning are essential to developing effective scoring criteria, promoting high expectations, and increasing equity in educational outcomes for all students.
  • Similarly, educators should use real-world examples of student work during the process of articulating, calibrating, and refining scoring criteria. When educators sort samples of student work into categories from emerging to exceeds, and then describe the objective features of the work at each level, scoring criteria will be more precise and educators will develop a stronger understanding of the kinds of evidence that match each performance level.

PRINCIPLE 3
Scoring criteria describe what students can do at each level of performance.

Do This
Scoring criteria should be written from the student’s point of view and reflect an asset-based approach to framing performance-level descriptions—i.e., they should focus on what students can do (not what they can’t do). The descriptions should use positive language that focuses on elevating student expectations, fostering continual improvement, and promoting learning growth over time.

Avoid This
Scoring criteria should not use deficit-based descriptions and framing—i.e., statements that articulate undesirable learning outcomes such as “I cannot [do something].” Negative language reinforces unhelpful mindsets, emphasizes learning deficits, and does not articulate an affirmative sequence of performance-improvement benchmarks that students can work to achieve.

Illustrative Example
Level: High School
Content Area: Health

Performance Indicator: Formulate a long-term personal-health plan incorporating decision-making and goal-setting strategies.

  • Emerging: I can list goals I have for my own health.
  • Developing: I can explain ways to reach a goal I set for my own health.
  • Proficient: I can create a plan to meet immediate and long-term health goals.
  • Distinguished: I can evaluate my progress and adapt my plan so that I can continue to positively impact my personal health.

Best Practices

  • While there are different ways to frame scoring criteria (e.g., the form known as SWBAT begins with “Students will be able to…”), the Great Schools Partnership recommends “I can” statements, such as “I can evaluate my progress and adapt my plan so that I can continue to positively impact my personal health.” Scoring criteria written from the student’s point of view help students embrace the goals as their own personal learning goals (as opposed to viewing the goals as, say, something they only have to do because a teacher told them they have to).
  • By articulating what students can accomplish at each level, asset-based scoring criteria describe the specific evidence students need to demonstrate to reach each level of proficiency. Similarly, teacher feedback should provide specific, actionable steps that students can take to improve the quality of their work and achieve proficiency.
  • Scoring students against a set of asset-based criteria is a practical application of a growth-mindset approach to teaching and assessment—it communicates that students can improve their knowledge and skills with practice and appropriate support. Asset-based scoring criteria help instill the belief in students that proficiency is achieved not because of innate intelligence or talents (a “fixed mindset” belief), but through perseverance in the face of challenges. Supporting students to develop a growth-mindset approach to learning has also been shown to increase their motivation and desire to learn and improve.

PRINCIPLE 4
Scoring criteria can be applied to a variety of learning experiences and work products.

Do This
Scoring criteria should be written to assess specific performance indicators, not learning experiences or work products. When scoring criteria are written to be “task neutral,” they can be combined and recombined to assess any task that teachers assign to or create with students.

Avoid This
Scoring criteria should not be written for specific lessons, units, courses, projects, or assignments, and they should not address any required components of a specific learning task (e.g., the interview protocol, bibliography, presentation, etc. for a specific research project).

Illustrative Example
Level: Middle School
Content Area: Social Studies

Performance Indicator: Compare the major regions of the Earth and their major physical features and political boundaries using a variety of geographic tools.

  • Emerging: I can locate the major regions of the Earth and their major physical features and political boundaries.
  • Developing: I can describe the major regions of the Earth and their major physical features and political boundaries.
  • Proficient: I can compare the major regions of the Earth and their major physical features and political boundaries using a variety of geographic tools.
  • Distinguished: I can analyze and evaluate connections among the major physical features and political boundaries of the Earth using a variety of geographic tools.

Best Practices

  • Scoring criteria should mirror the language of the performance indicator, avoiding any reference to the details or requirements of a specific task (e.g., a research project or lab report). For tasks that include certain required elements—such as a minimum number of citations in a research project or labeling the parts of a diagram in a lab report—these requirements should be articulated in the instructions associated with that assessment (not in the scoring criteria).
  • Because scoring criteria evaluate student work against a performance indicator—rather than a specific task—students have the opportunity to demonstrate their knowledge and skills in a variety of ways. In the example above, a student could demonstrate proficiency by explaining the projected environmental impact of a land-use proposal through an oral presentation, a written report, or a multimedia production. When scoring criteria are “task neutral,” they help teachers meet diverse learning needs by consistently applying expectations for student work across diverse learning experiences and work products, including, for instance, individualized education programs (IEPs) or assessment retakes.

→ Download Scoring Criteria Design Guide (.pdf)

Assessment Pathways Simplified

The graphic below provides a simplified overview of the major features of an assessment system to help educators understand how certain elements work together in a school, district, or other system. Specifically, the graphic illustrates (1) five different assessment pathways that schools might use to evaluate student work and proficiency, (2) the degree of instructional flexibility and student choice that attends each option, and (3) the potential compromises and outcomes that result from each pathway, including consistency in results. Pathways 1–4 represent four proficiency-based approaches to assessment, while pathway 5 represents an assessment scenario that will not result in a commonly applied definition of proficiency.

[Graphic: Assessment Pathways Simplified]

Definitions
The following definitions will help to explain the graphic in greater detail:

Learning Experience refers to any instructional interaction, course, program, or other experience in which student learning takes place, whether it occurs in traditional academic settings (schools, classrooms) or nontraditional settings (outside-of-school locations, natural environments), or whether it includes traditional educational interactions (e.g., students learning from teachers, professors, or support staff) or nontraditional interactions (e.g., students learning through internships, service-learning experiences, or online learning platforms).

Demonstration Task refers to any form of assessment that educators use to determine whether students have met expected learning standards and outcomes. A demonstration task could be a traditional test or quiz; a writing assignment or other work product; a presentation, exhibition, or capstone experience; or a portfolio of work that students compile over time.

Scoring Guide refers to rubrics, scoring criteria, or other tools and guidelines that educators use to evaluate assessments and student work against a set of learning standards. Scoring guides establish the expectations for learning—i.e., the definition of proficiency and the criteria against which it will be measured. In a proficiency-based system, common scoring guides are necessary to ensure greater validity, reliability, and comparability of learning outcomes, particularly when students are pursuing personalized learning pathways.

Student Choice refers to instructional approaches or techniques that are based on student interests, passions, or ambitions, and the degree to which educators give students choices during the educational process. When learning experiences, demonstration tasks, and scoring guides are common, students will inevitably be given less choice during the learning process; when all three are unique, more student choice is possible, but it is also harder for educators to maintain consistent learning expectations, assessment practices, and proficiency determinations across different educational experiences, content areas, or grade levels.

Valid and Reliable refers to the degree to which learning outcomes can be validated—or “certified”—by educators in a school, and to the degree of confidence educators have in results of the assessment process. (*In this case, the term valid and reliable is being used in the sense of “defensible education results based on evidence,” not in the technical psychometric sense used by developers of large-scale assessments and standardized tests). It is also important to note that “valid and reliable” results are only a potential outcome of each assessment pathway option—such results are not assured, although common scoring guides can significantly increase the likelihood that the assessment process will result in grades and academic reports that can be considered both valid and reliable.

Comparable refers to the ability of educators to have a high degree of assurance that proficiency assessments are consistent from student to student and teacher to teacher. When results are comparable, educators have a high degree of confidence that a consistent judgment about learning results and “proficiency” has been made. While specific learning outcomes may differ from student to student (some will learn more, others less), comparability is achieved when the same definition of proficiency has been consistently applied across diverse students, learning experiences, and subject areas.

Designing and Assessing Homework

In a proficiency-based system, homework—i.e., assignments completed largely outside of the classroom and without direct support and supervision from teachers—should be instructionally purposeful and connected to clearly defined learning standards. The Great Schools Partnership recommends that teachers consider the following general guidelines when assigning homework in a proficiency-based learning environment:

  1. All homework assignments should be relevant, educationally purposeful, and driven by clearly defined learning objectives for a unit or lesson.
  2. Students should be given an equal and equitable opportunity to complete all homework assignments. Given that some home situations may complicate a student’s ability to complete an outside-of-class assignment—such as households that have no computers or internet connection—schools and teachers need to ensure that every student has access to all necessary materials, technologies, and resources regardless of their socioeconomic status, language ability, disability, or home situation.
  3. The failure to complete or turn in homework on time should not affect a student’s academic score unless the work being done outside of class is part of a larger summative assessment.
  4. The failure to complete or turn in homework on time may be reflected in a student’s habits-of-work grade.
  5. Students should be given additional opportunities to improve, complete, and resubmit homework as an additional demonstration opportunity when reasonable and appropriate. If the assignment is part of a larger summative assessment, the improved scores should be counted, not earlier scores or a combination of scores.
  6. Teachers should provide feedback in a timely fashion so that students know how well they performed before they take the next assessment.
  7. The purpose of all homework assignments should be clearly articulated to and understood by students; specifically, students should know what learning objectives and performance indicators the assignment addresses, and what criteria will be used if the homework assignment is going to be assessed.
  8. Students should know in advance if a homework assignment is going to be assessed, and whether the assignment will be a formative assessment or a graded part of a larger summative assessment.
  9. To the extent possible, homework should be differentiated for students, which includes, when appropriate, student-designed learning tasks and projects that allow them to demonstrate proficiency in ways that engage their personal interests, ambitions, and learning needs.

Creative Commons License
Proficiency-Based Learning Simplified by Great Schools Partnership is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
