Introducing formative assessment: Helping international students adapt in a law degree programme

Author:

Terézia Švedová, Comenius University in Bratislava

Keywords:

(peer) assessment, continuous assessment, formative assessment, legal history

Summary:

This chapter evaluates the introduction of continuous and varied assessment into an introductory legal history course through formative assessment (a series of weekly revision tests with personalised feedback), while prioritising student-centred learning. The aim of the innovation was to help students increase their in-class activity, divide their workload more evenly throughout the semester and, consequently, prepare better for the final exam. As a result of the innovation, students were active in the classroom, paced their learning more evenly and achieved much higher grades on the final exam and in the course overall.

This chapter evaluates the effectiveness of a two-step innovation of student assessment in a course on legal history. The course used to follow a traditional approach to assessment in which the final exam accounted for 80% of the semester grade and the remaining 20% came from ‘points for activity’. How those points could be earned was never clarified, and since the course could be passed without them, students had little motivation to pay attention in class or study during the semester. They then struggled to study everything in a rush before the final exam, which carried the risk of failing the course. To remedy the situation, I redesigned the assessment criteria and introduced formative assessment through weekly revision tests. The more balanced and varied assessment process, together with the new incentives to prepare continuously, was intended to help students pass the course and improve their overall learning experience.

The innovation was informed by four pedagogical concepts. First, continuous assessment, which Le Grange and Reddy (1998) define as a judgement about learners’ performance made on an ongoing basis, or as an ongoing measurement of the learning process through a variety of assessment tools. The purpose of continuous assessment, as opposed to traditional (one-time, often written) assessment, is to enable the evaluation of a wider range of educational outcomes and to provide information about the learning process and the learner’s development. Its most significant benefit is that all aspects of the learning process (intended learning outcomes, instruction, resources, teaching methods, assessment, etc.) can be adapted to meet individual student needs (Le Grange and Reddy 1998).


Second, formative assessment, which has the potential to strengthen the effectiveness of continuous assessment, if carried out throughout the learning process and used to inform and influence learning (Le Grange and Reddy 1998). It allows the teacher to monitor student learning, comprehension and progress during the semester, and thus, to adjust the teaching process, flexibly react to the specific needs of students and give them feedback about their strengths and weaknesses (Schildkamp et al. 2020; Evans et al. 2014; Ozan and Kincal 2018). Formative assessment has been used to successfully resolve teaching challenges similar to those I faced. For example, Gachallová (2018) relied on formative assessment by using online quizzes to improve the learning and understanding of medical terminology. Inspired by such examples, I introduced both continuous and formative assessment to the course.


Third, student-centredness puts the focus of the educational process on the students and their learning experiences. The teacher becomes a facilitator of the learning process, while students are the ones responsible for their own learning (Dongl et al. 2019; Keiler 2018; Kaput 2018). Therefore, in-class activities were added to replace frontal lecturing, and it was clarified how students could earn credit for them. In addition, assignments requiring students to undertake their own research and develop critical thinking, rather than just passively absorbing information presented by the teacher, were also included.


Fourth and relatedly, constructive alignment assumes a teaching (learning) system in which all aspects of teaching and assessment are in harmony and tuned to support high-level learning (Biggs 1996). This concept guided the redesign of the assessment criteria: the newly added weekly revision tests were aligned with the content and activities of all the previous course sessions, while the varied assessment methods were matched with the teaching methods (Biggs 1996; Loughlin et al. 2021; Hailikari et al. 2022).


Varied assessment

The redesigned assessment criteria prioritised continuous assessment and, thus, continuous learning. Continuous assessment is spread throughout the semester and accounts for 70% of the final grade. It is divided into three components: (1) weekly revision tests, (2) a multi-step assignment on a selected topic, consisting of legal research, a written argumentation, a written pleading and an oral presentation, and (3) in-class activity. The remaining 30% comes from the final exam: a case study and questions about issues discussed in class.


Table 1. The newly designed, varied assessment criteria for History of Private Law


| Assessment type | Content | Percentage of final assessment |
|---|---|---|
| Continuous assessment | Weekly revision tests | 10 |
| Continuous assessment | Legal research, written argumentation, written pleading and oral presentation with visual aids on a selected topic | 40 |
| Continuous assessment | In-class activity | 20 |
| Final assessment | Oral exam (case study and oral discussion) | 30 |


Weekly revision tests

Since the course was held online, the weekly revision tests were also administered online using Microsoft Forms, which made it possible to provide feedback efficiently by uploading comments for each question. The tests were cumulative: each test contained ten questions addressing issues from the previous class sessions. The tests started out with multiple-choice questions only, but during the semester the number of multiple-choice questions gradually decreased while the number of open-ended questions increased (Table 2). Some of the open-ended questions were shorter case-study questions requiring students to read and interpret a legal text, e.g. an extract from the materials covered in class.


Table 2. The ratio of closed and open-ended questions in the weekly revision tests


| Week | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
|---|---|---|---|---|---|---|---|---|---|---|
| Closed : open questions | 10:0 | 9:1 | 8:2 | 7:3 | 6:4 | 5:5 | 4:6 | 3:7 | 2:8 | 1:9 |


Since the tests were intended to promote continuous learning and provide formative assessment, they were graded only as pass or fail. The tests counted toward the final grade: students had to score at least 60% to pass a test, and each passed test earned one point (1% of the final grade). This was expected to motivate students to study continuously for the tests without overly stressing them. I provided individual feedback to each student after every test, highlighting their strengths and areas for further improvement. Students could also ask questions after the test, and I went over the questions that most of them found difficult.


Based on this revision of the course assessment, I hypothesised that the weekly revision tests help students divide their workload evenly throughout the semester (H1), that the innovation helps students prepare for the final exam (H2), and that the innovation enhances the student learning experience (H3).

In September 2019, I started my PhD studies at Comenius University in Bratislava, Faculty of Law. During my first semester, I facilitated three sessions of the course History of Private Law and have continued to co-teach the course ever since. The course is taught to first-year international students studying in the English-language Master’s degree programme. It is mandatory inasmuch as students must take one legal history course. The class meets once a week during the thirteen-week semester. The composition of the class varies each year, and so far I have taught students from the USA, Spain, China, Switzerland, Russia, Ukraine, Georgia and Slovakia.


The innovation was implemented in the History of Private Law course in the winter 2021 semester and, because of the ongoing COVID-19 pandemic, the classes were held online. The intended learning outcomes of the course were to define selected legal institutes, to compare legal institutes across several legal systems, and to critically examine materials and legal sources in order to solve a legal problem. Due to the pandemic, the course was attended by only three students – from Russia, Ukraine and Georgia – and facilitated by three teachers.

In order to evaluate the effectiveness of the innovation, four different types of data were collected throughout the semester (Table 3). The first data source was an anonymous student feedback form designed specifically for this research. It consisted of ten questions about student perceptions of their preparation during the semester, comparisons with other courses and their overall opinion of the course. It was completed by all three students prior to the final exam.

 

Second, the two colleagues with whom I co-taught the course each observed one classroom session and filled out an observation form with seven questions about issues such as student preparedness, improvement in knowledge and skills, and student engagement in or resistance to learning, compared with the sessions they taught themselves.

 

Third, I kept a reflective teaching diary in which I recorded my observations on the same issues as the observers. Since these two sources are complementary, I use them in parallel in the analysis.

 

Finally, I relied on data about student performance. I used the students’ grades on the weekly revision tests and their final grades to evaluate their learning. Furthermore, to understand not only whether the students did well on the final exam but also how their performance compared with that of their peers, I used a matching design. Although data from previous semesters were no longer available for this course, in the winter 2021 semester I taught the same course to second-year MA students enrolled in the Slovak-language version of the programme. That course had a similar design to the English-language History of Private Law course prior to the innovation, with one exception: the assessment ratio was 90:10 (not 80:20) in favour of the final exam. The exam was written and contained both practical and application questions. The course was attended by seven students. For the comparison, I used the final grades as well as the grades received for each assessment type and item. In most cases I report actual scores rather than descriptive statistics due to the small number of students.

 

Table 3. Data sources and assessment methods used to verify the hypotheses

 

| Hypothesis | Data | Method of analysis |
|---|---|---|
| H1 | Grades (control and treatment groups) | Comparative quantitative analysis in a matching design |
| H2 | Test results | Longitudinal, individual-level analysis |
| H2 | Feedback on tests | Qualitative analysis |
| H1 + H2 + H3 | Anonymous student feedback form | Qualitative analysis |
| H2 + H3 | Classroom observations | Qualitative analysis |
| H2 + H3 | Reflective teaching diary | Qualitative analysis |

 

Regarding H1, I found that students spread their studying more evenly throughout the semester. First, students regularly showed up in class, not even using the three allowed absences. Only student C missed classes in which revision tests were administered, and hence received no credit for those tests (Table 4). Continuous attendance also meant continuous preparation and active participation during the classes. I noted in my diary, and the observers noted the same during their respective observations, that the students did seem to be prepared for each class: at the very least, they prepared for the initial brainstorming activity that took place at the beginning of each session.


Second, they responded to questions and even took the initiative to ask their own. Their willingness to engage in discussion increased over the first few classes. As for in-class activities and active participation, two students believed they were more attentive in class than in other courses and that this helped them prepare for the final exam. Only the third student noted that she had a hard time actively participating in class because she had a part-time job.


It is notable, however, how much their prior knowledge determined the level of their engagement. Student B seemed to have the best knowledge base and was also the most engaged in discussion. Student A’s level of knowledge varied by topic, and while she always tried her best to engage, the quality of her contributions varied accordingly. Student C seemed to lack some basic knowledge, struggled with learning in English and needed the most assistance. Her requests to reformulate or repeat questions signalled not only her language difficulties but also her willingness to engage.


Table 4. Test results of the treatment group in percentages. Asterisks mark unsuccessful attempts (scores below the 60% pass threshold); dashes mark tests not taken

| | Test 1 | Test 2 | Test 3 | Test 4 | Test 5 | Test 6 | Test 7 | Test 8 | Test 9 | Test 10 | Total points earned |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Student A | 65% | 70% | 35%* | 65% | 35%* | 40%* | 65% | 50%* | 40%* | 50%* | 4 |
| Student B | 85% | 75% | 80% | 85% | 65% | 85% | 80% | 60% | 80% | 85% | 10 |
| Student C | 40%* | 65% | 45%* | 65% | 45%* | 40%* | 40%* | 50%* | — | — | 2 |


Prior knowledge and language skills likely also influenced the test results. On the one hand, it is encouraging that the students always reached at least 35% on the tests and that their performance did not decrease over time as the tests shifted from multiple-choice to the more complex open-ended questions. Both my colleagues and I found that when we referred to concepts discussed in previous classes, the students were familiar with them and did not need repeated explanations. At the same time, as the semester went by, the students did not remember key information discussed earlier, and it was clear they did not repeatedly revisit material from all of the previous class sessions.


On the other hand, the three students combined failed to achieve 60% (and thus receive credit) on more than one third of all their tests (see the scores marked with an asterisk in Table 4). While student B never failed a test, student A passed the threshold for credit only four times and student C only twice. In the feedback form, only one student (most likely student B) said that they prepared for the tests on a weekly basis. The other two students emphasised that they paid more attention during class rather than preparing for the tests. This explains why they often performed worst on the open-ended questions that required in-depth knowledge.


The feedback on student A’s tests showed clearly that she struggled when more detailed information was necessary to answer a question and did best on multiple-choice questions. The student apparently paid attention during in-class activities, since the least troublesome topics for her were those covered in the week prior to the test. Moreover, when a knowledge gap was pointed out in the feedback and the same issue was raised again in a different test, the student answered correctly the next time, even if the question occurred a few weeks later, with only one exception. As for student C, she apparently did not study the materials thoroughly, since she usually failed the questions requiring deeper knowledge (with a few exceptions where she achieved a higher percentage). At the same time, she did take my recommendations seriously: for example, when I advised her to watch the recording of a course session that she had missed (test 2), she did so prior to test 4 and achieved one of her highest scores.


In addition to the in-class activity and weekly revision tests, students wrote a research paper in several steps. This also contributed to their continuous learning: their outstanding performance (they all received 39 out of the 40 points available on this assignment) was only possible through repeated, in-depth engagement with the topic chosen for their paper (Table 5). All in all, students did study continuously throughout the semester; however, their engagement with the material was not always as thorough as I had wished.


Table 5. Treatment group student performance on assessment tasks during the semester in scores and percentages

| | Weekly revision tests | In-class activity | Research paper | Final exam | Total points | Letter grade |
|---|---|---|---|---|---|---|
| Maximum points | 10 | 20 | 40 | 30 | 100 | |
| Student A | 4 (40%) | 19 (95%) | 39 (97.5%) | 28 (93.3%) | 90 | A |
| Student B | 10 (100%) | 20 (100%) | 39 (97.5%) | 29 (96.7%) | 98 | A |
| Student C | 2 (20%) | 17 (85%) | 39 (97.5%) | 28 (93.3%) | 86 | B |
| Assessment type | Continuous assessment | Continuous assessment | Continuous assessment | Final assessment | Final grade | Final grade |


The innovation positively impacted student performance on the final exam (H2). All three students did exceptionally well on the final exam, achieving 93.3% or above (Table 5). More tellingly, they did much better than their Slovak peers in the control group: none of the Slovak students reached more than 90%, while five of the seven achieved less than 60% (Table 6). Comparing the average performance of the control and treatment groups on the final exam, the treatment group shows a significantly (p = 0.001) higher mean score (94.43) than the control group (49.93) (see Table 7).


Table 6. Control group student performance on assessment tasks during the semester in scores and percentages

| | In-class activity | Final exam | Total points | Letter grade |
|---|---|---|---|---|
| Maximum points | 10 | 90 | 100 | |
| Student D | 10 (100%) | 26 (28.9%) | 36 | FX |
| Student E | 5 (50%) | 45 (50%) | 50 | FX |
| Student F | 7 (70%) | 19 (21.1%) | 26 | FX |
| Student G | 8 (80%) | 75 (62.5%) | 83 | B |
| Student H | 4 (40%) | 49.5 (55%) | 53.5 | FX |
| Student I | 9 (90%) | 38 (42%) | 47 | FX |
| Student J | 10 (100%) | 81 (90%) | 91 | A |
| Assessment type | Continuous assessment | Final assessment | Final grade | Final grade |


Although the in-class activity of most control group students was fairly high (70% or above), four of them did noticeably worse on the final exam than during the continuous assessment, suggesting little connection between the two. The international students, by contrast, earned high grades for in-class activity and the research paper, comparable to their final exam performance. Some disconnect exists between the treatment group’s weekly revision test results and their final exam scores, but since the tests were primarily designed as a formative assessment instrument and the students reacted positively to the feedback, as described above, it is reasonable to conclude that the weekly revision tests, like the varied assessment criteria, had a positive impact on their performance.


Student opinion supports this interpretation. All the students found the feedback provided after each test very useful, either for the next revision tests or for the final exam. Two students said they felt better prepared for the final exam than in other courses due to the repetitive nature of the revision tests. The third student thought the effort she made to prepare for the final exam was comparable to that in other courses. In addition, the new assessment structure, which put more emphasis on active learning, proved valuable, as all three students said the in-class activities helped them with their learning. Nonetheless, it is important to note that, based on the available data, we can establish that the innovation as a whole helped student performance on the final exam, but we cannot discern the extent to which each element of the new assessment scheme contributed to student preparation for the final exam.


Table 7. Comparing the average performance of the treatment and control group students on the final exam


| Group | N | Mean | Standard deviation | t-score | p-value |
|---|---|---|---|---|---|
| Control | 7 | 49.93 | 22.82 | 5.115 | 0.001 |
| Treatment | 3 | 94.43 | 1.96 | | |
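
Assuming an unequal-variance (Welch’s) t-test, with which the reported figures are consistent, the t-score in Table 7 can be recovered from the group means and standard deviations:

\[
t = \frac{\bar{x}_{\text{treatment}} - \bar{x}_{\text{control}}}{\sqrt{s_{\text{treatment}}^{2}/n_{\text{treatment}} + s_{\text{control}}^{2}/n_{\text{control}}}} = \frac{94.43 - 49.93}{\sqrt{1.96^{2}/3 + 22.82^{2}/7}} \approx 5.1,
\]

which matches the reported value of 5.115 up to rounding.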


Finally, the student learning experience noticeably improved (H3). The students said that they found the course enjoyable: one student pointed out that she felt comfortable thanks to the teachers’ approach, while the others enjoyed the various activities and tasks. My colleagues and I observed that the students felt comfortable during classes and enjoyed the activities. None of the students thought the weekly revision tests were too much of a burden. They appreciated that the questions did not require lengthy answers and that the tests created an opportunity to revise the materials and earn points toward the final grade. What the students did not enjoy were the questions that referred to longer legal texts, the possibility of failing, and the fact that the tests were administered early in the morning due to time zone differences. Overall, the redesigned course provided students with a positive learning experience.

This innovation is suitable for any course, whether in the humanities or the sciences, that lacks continuous preparation and requires building a certain knowledge base, such as introductory-level courses. Depending on the possibilities open to the teacher, the innovation can be adopted in full (both steps) or by adding only the revision tests. Finally, the approach of weekly testing with open-ended questions and individualised feedback is most suitable for courses with a small number of students, where the teacher can realistically provide such feedback every week.

 

Specifically, the continuous assessment aspect of the course, the weekly online revision tests, proved to be key to improving student performance on the final exam. I will keep them in future iterations of the course and will continue to administer the tests online even when normal face-to-face teaching resumes, because online administration makes providing feedback efficient. Since providing individualised written feedback to every student each week was laborious, I suggest either decreasing the frequency of the tests (e.g. to twice a month) or decreasing the number of questions (e.g. to five or six).

To sum up, the new, varied assessment methods with the weekly revision tests helped students perform better both on the final exam and in the course in general, and enhanced their learning experience. The weekly revision tests helped students divide their workload more evenly throughout the semester: although I would have preferred them to prepare for the tests more thoroughly, it seems that even a more modest level of continuous learning can help students succeed. The improvements, however, cannot be attributed solely to my innovation in assessment. The course co-teachers changed other aspects of the course as well (see Hron’s chapter in this volume), which makes it difficult to separate out the exact effect of each new feature. In the future, it would be useful to investigate what level of continuous learning is ideal in order not to overburden the students.

  • Biggs, J. (1996) ‘Enhancing teaching through constructive alignment’, Higher Education 32, pp. 347-364.
  • Dongl, Y., Wu, S.X., Wang, W. and Peng, S. (2019) ‘Is the student-centered learning style more effective than the teacher-student double-centered learning style in improving reading performance?’, Frontiers in Psychology 10, available at: https://www.frontiersin.org/articles/10.3389/fpsyg.2019.02630/full.
  • Evans, D.J.R., Zeun, P. and Stanier, R.A. (2014) ‘Motivating student learning using a formative assessment journey’, Journal of Anatomy 224:3, pp. 296–303.
  • Gachallová, N. (2018) ‘Using an online quiz as a formative tool in Latin medical terminology courses’, in G. Pleschová and A. Simon (eds.) Early career academics’ reflections on learning to teach in Central Europe, London: SEDA, pp. 162-170.
  • Hailikari, T., Virtanen, V., Vesalainen, M. and Postareff, L. (2022) ‘Student perspectives on how different elements of constructive alignment support active learning’, Active Learning in Higher Education 23:2, pp. 217-231.
  • Kaput, K. (2018) Evidence for student-centered learning, Saint Paul, MN: Education Evolving, available at: https://files.eric.ed.gov/fulltext/ED581111.pdf.
  • Keiler, L.S. (2018) ‘Teachers’ roles and identities in student-centered classrooms’, International Journal of STEM Education 5, available at: https://stemeducationjournal.springeropen.com/articles/10.1186/s40594-018-0131-6.
  • Le Grange, L. and Reddy, C. (1998) Continuous assessment: An introduction and guidelines to implementation, Johannesburg: Juta.
  • Loughlin, C., Lygo-Baker, S. and Lindberg-Sand, Å. (2021) ‘Reclaiming constructive alignment’, European Journal of Higher Education 11:2, pp. 119-136.
  • Ozan, C. and Kincal, R.Y. (2018) ‘The effects of formative assessment on academic achievement, attitudes toward the lesson, and self-regulation skills’, Educational Sciences: Theory & Practice 18:1, pp. 85-118.
  • Schildkamp, K., van der Kleij, F.M., Heitink, M.C., Kippers, W.B. and Veldkamp, B.P. (2020) ‘Formative assessment: A systematic review of critical teacher prerequisites for classroom practice’, International Journal of Educational Research 103, available at: https://research.utwente.nl/en/publications/formative-assessment-a-systematic-review-of-critical-teacher-prer.