Evaluation
This evaluation plan addresses all four levels of the Kirkpatrick model and seeks to determine whether the course, once launched, meets its goals within our organization.
E-learning Goals
The institutional goal of this course is to reduce class send-outs and both in- and out-of-school suspensions, since time out of class has a profound negative impact on student learning. Additionally, students learn better in a calm, positive environment where they know they are valued and respected. The mission of my current organization is to “educate responsible citizen-scholars for success in the college of their choice and a life of active citizenship.” A sense of belonging and community is essential for promoting civic engagement, and more time on task will help students achieve their highest academic potential.
Kirkpatrick Levels
Level 1 Reaction: On a very basic level, we’ll assess metrics such as how many of the teachers who begin the course finish it, how many report in very short surveys throughout the course that it was useful or changed their teaching, and how many coaches have followed up with their teachers to discuss aspects of the course and practice course skills. These assessments rely on self-report and therefore have limited use. I think of them as “canary in a coal mine” measures: if the data is positive, we haven’t learned much, but if the data is negative, we know we have a serious problem with the foundation of the course and need to take a closer look quickly to find out why.
Level 2 Learning: For Kirkpatrick Level 2, the course will include formative assessments throughout, including multiple-choice questions, sorting activities, and potentially other interactives, either embedded in Canvas or built in other systems (H5P, SmartBuilder). Learners will get immediate feedback on their formative assessments, including the correct response and a brief explanation of why it is correct, so that learners can see their progress through the course objectives. As a summative assessment of learning, learners will participate in a practice conversation with another teacher in their cohort, using a case study/scenario either of their own choosing or from those provided. Teachers will record their practice session for evaluation by their coach. Additionally, as the course creator, I can watch the videos to evaluate progress and the efficacy of the program. These conversations will be scored with a rubric, with coaches normed on scoring in a pre-launch meeting. Norming the use of the rubric will help ensure that it is applied fairly and consistently across coaches, strengthening the reliability, and therefore the validity, of our assessment.
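To make the norming concrete, the instructional design team could run a quick agreement check after coaches score the same sample recording: if coaches frequently disagree on a rubric dimension, that dimension needs clarification before launch. The sketch below is a minimal example in Python; the rubric dimensions, coach names, and the 1-4 scale are hypothetical placeholders, not the final rubric.

```python
# Minimal sketch: percent agreement between coaches after a norming session.
# Dimension names, coach names, and the 1-4 scale are hypothetical placeholders.

from itertools import combinations

# Each coach's scores for the SAME sample recording, keyed by rubric dimension.
norming_scores = {
    "Coach A": {"warm_greeting": 3, "open_questions": 2, "student_voice": 4},
    "Coach B": {"warm_greeting": 3, "open_questions": 3, "student_voice": 4},
    "Coach C": {"warm_greeting": 3, "open_questions": 2, "student_voice": 3},
}

def percent_agreement(scores: dict) -> float:
    """Share of coach pairs that gave the same score, pooled across dimensions."""
    coaches = list(scores)
    dimensions = scores[coaches[0]].keys()
    matches, comparisons = 0, 0
    for dim in dimensions:
        for a, b in combinations(coaches, 2):
            comparisons += 1
            matches += scores[a][dim] == scores[b][dim]
    return matches / comparisons

if __name__ == "__main__":
    agreement = percent_agreement(norming_scores)
    print(f"Pairwise agreement: {agreement:.0%}")  # e.g., flag anything well below 80%
```

A more formal statistic (such as Cohen’s kappa) could replace simple percent agreement, but even this rough check would surface rubric dimensions that need another round of norming.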
Level 3 Behavior: Because Kirkpatrick Level 3 is about behavioral change, any assessment at this level needs to be practical and observational. Because the course is taken as a cohort, it makes sense for learners to engage in cohort discussions at regular intervals after the end of the course to check in. One assessment could be a discussion board at two, six, and nine weeks after the course, where teachers share examples of how their language with students has changed, or where they provide “case studies” of how they applied what they learned in actual interactions with students. While self-report is not always a reliable assessment of skill, asking teachers to describe very specific uses of the course skills can give a sense of how things are going. And by having that exchange as a discussion, teachers may also get the side benefit of camaraderie with their cohort.
To accompany these discussions, coaches will also be asked to observe teachers’ interactions with students at the same intervals (two, six, and nine weeks). These observations can be evaluated using a rubric similar to the one used for the course’s summative assessment, with performance dimensions and levels aligned to the learning objectives. Coaches will share their observations with learners in coaching meetings, providing additional feedback to continue improving teachers’ skills in this arena. For programmatic evaluation purposes, coaches will also share their data with the instructional design team. If possible, the instructional design team will also observe teachers, either in person or by watching videos of classes or of conversations between teachers and students. The latter may be challenging to obtain, and it will be important to shield the identity of students if such videos are created.
Level 4 Results: To assess organizational impact, coaches and the instructional designer will look at a variety of metrics that can help determine whether learners have actually moved their schools closer to organizational goals. These include:
- Using our existing data dashboards (through DeansList and Google Tools) to compare the number of send-outs per learner before and after the course, with the caveat that other causal factors may also affect the number of send-outs in a class (a minimal analysis sketch follows at the end of this section);
- Comparing the number of in- and out-of-school suspensions among the students of participating learners to see trends;
- If possible, examining send-out and suspension data to identify “frequent fliers”: students with several behavioral interventions who might skew the numbers. One ideal measure here would be a reduction in the number of such students at a school, or a reduction in the number of interventions such a student receives, either of which might indicate stronger teacher/student relationships that influence behavior. This measure is inexact, though.
- Using existing student surveys in our NYC schools to compare pre- and post-program results on questions/prompts such as:
- Adults at this school communicate with me in a language that I can understand.
- My teachers treat me with respect.
- Conflicts are resolved fairly in this school.
- When a conflict arises, school staff use questions to encourage reflection and resolve it.
- Students can share their perspectives collaboratively when making decisions on how to address conflict.
- Using existing family surveys in our NYC schools to compare pre- and post-program results on questions/prompts such as:
- Conflicts are resolved fairly at my child’s school.
- Students can share their perspectives collaboratively when making decisions on how to address conflict.
- Performing a qualitative analysis of teacher language in student referrals (the emails sent to explain the situation when a student is sent out of class), looking for language that discusses students respectfully and checking that send-outs are not given for behaviors that are developmentally appropriate (since children should not be punished for acting their age).
The New York City school surveys are not a perfect tool, but they do offer an external measure of student attitudes that we can use to check whether the outcomes we think we see in our internal assessments align with that external assessment. If there is a significant gap, we know that one or more of our assessment tools may not be reliable and will need to be adjusted.
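To illustrate the first metric above, the pre/post send-out comparison could be run on an export from the existing dashboards. The following is a minimal sketch in Python with pandas; the file name and columns (teacher, period, sendouts) are hypothetical placeholders rather than the actual DeansList or Google export schema.

```python
# Minimal sketch: compare send-outs per teacher before vs. after the course.
# "sendouts_export.csv" and its columns (teacher, period, sendouts) are
# hypothetical placeholders for whatever the dashboard export actually provides.

import pandas as pd

df = pd.read_csv("sendouts_export.csv")  # one row per teacher per time period

# Total send-outs per teacher, split by pre-course vs. post-course period.
summary = (
    df.pivot_table(index="teacher", columns="period",
                   values="sendouts", aggfunc="sum")
      .fillna(0)
)
summary["change"] = summary["post"] - summary["pre"]

print(summary.sort_values("change"))  # largest decreases first
# Caveat from the plan: other factors (roster changes, time of year, school-wide
# policies) can also move these numbers, so treat this as directional evidence.
```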
Assessment Techniques and Tools
Most of our electronic assessment tools will be embedded in Canvas, although the course might also use H5P or SmartBuilder interactives. Because Canvas is fairly robust and offers a free platform, it is ideal for this course, which has no budget attached to its creation. The various interactive tools available in Canvas provide opportunities for in-platform feedback, quizzes with immediate feedback, and ways for coaches (added as co-teachers in the platform) to assess participant progress.
Example Tools:
- Online quizzes: these multiple-choice quizzes will present a question stem and four answer choices. They may cover developmentally appropriate behaviors from the course textbook or principles of de-escalation from a course reading.
- Sorting activity: learners will sort potential responses to students into two categories, effective and ineffective, with immediate feedback provided (a content sketch follows after this list). For example:
- Scenario: A student comes to your first-period 9th grade Algebra class ten minutes late, having just arrived at school. You meet him at the door, saying: “You’re late! The assignment is on the board. Grab a seat and get working!” (This communication should be sorted as ineffective.)
- Feedback if correct: You are right: this communication is ineffective. By starting the conversation with a reminder that the student is late, you are pointing out his tardiness without any information as to why he is late, potentially embarrassing him. Instead, try saying, “Good morning, I’m glad you made it! The assignment is on the board. Grab a seat and get working, and let me know if you need anything to get started!” This response is warm and direct, offering the student the opportunity to join class without making any value judgment about his arrival time.
- Feedback if incorrect: This communication is ineffective. By starting the conversation with a reminder that the student is late, you are pointing out his tardiness without any information as to why he is late, potentially embarrassing him. Instead, try saying, “Good morning, I’m glad you made it! The assignment is on the board. Grab a seat and get working, and let me know if you need anything to get started!” This response is warm and direct, offering the student the opportunity to join class without making any value judgment about his arrival time.
- A Zoom role-play conversation between the learner and a “student,” recorded and shared with the learner’s coach. Our current organization uses Zoom as its online meeting platform, and recording meetings there is easy, making this a low lift for participating learners.
- Rubric evaluation in Canvas
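Before the sorting activity is built in Canvas, H5P, or SmartBuilder, its content could be drafted and reviewed as plain data so coaches can vet the scenarios and feedback text. The sketch below shows one way to organize the scenario, the correct category, and the feedback in Python; the structure and field names are my own illustration, not an H5P or SmartBuilder format.

```python
# Minimal sketch: sorting-activity items drafted as plain data for review.
# The structure and field names are illustrative, not an H5P/SmartBuilder schema.

from dataclasses import dataclass

@dataclass
class SortingItem:
    scenario: str   # the teacher-student interaction to be sorted
    category: str   # "effective" or "ineffective"
    feedback: str   # explanation shown after the learner sorts the item

ITEMS = [
    SortingItem(
        scenario=("A student arrives ten minutes late to first-period Algebra. "
                  "Teacher: 'You're late! The assignment is on the board. "
                  "Grab a seat and get working!'"),
        category="ineffective",
        feedback=("Opening with the student's lateness points it out without any "
                  "information about why he is late and may embarrass him. A warm, "
                  "direct greeting ('Good morning, I'm glad you made it!') lets him "
                  "join class without a value judgment about his arrival time."),
    ),
]

def check_answer(item: SortingItem, learner_choice: str) -> str:
    """Return feedback, prefixed to confirm or correct the learner's sort."""
    prefix = "You are right: " if learner_choice == item.category else "Not quite: "
    return prefix + f"this communication is {item.category}. " + item.feedback

if __name__ == "__main__":
    print(check_answer(ITEMS[0], "effective"))
```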
Evaluation Plan
The sections above covered the Level 3 and Level 4 assessments, which together encompass programmatic evaluation against our intended outcomes. A couple of points to emphasize:
- Any time an assessment asks a learner to self-report, there is a corresponding assessment by a coach or an outside measure to validate that data. In this way, we can gather some valuable insight into the minds of our learners while also using more objective measures of the program.
- The data that we collect will inform course redesign in a number of ways:
- If learners are performing poorly on formative assessments throughout the course, we’d want to look at how information is presented to see if we can improve information acquisition.
- If learners are not completing the program, we’d want information as to why, perhaps sending surveys to “drop outs” or asking coaches to follow up to see what the barriers to completion were.
- If learners are struggling with the summative assessment, we’d need to determine if our course materials actually promote the kind of skill building we are hoping to see.
- If teachers perform well on the summative assessment but do not change their in-class behaviors, we’d redesign to make the program more behavior-based, perhaps with more one-to-one coaching.
- If school metrics do not change (send-outs either do not change or increase; suspensions do not change or increase), we’d need to check to see if our assumptions are correct, and if we have chosen the right lever (student/teacher relationships) to move those numbers going forward.
Questionnaire
- What was your overall experience with the course?
- Would you recommend this course to a colleague? Why or why not?
- In what specific ways have you seen your classroom practice change because of this course?
- If you were tasked with revising this course, what is the biggest change you would make and why?