Positive Action

Blueprints Program Rating: Model

A school-based social emotional learning program for students in elementary and middle schools to increase positive behavior, reduce negative behavior, and improve social and emotional learning and school climate. The classroom-based curriculum teaches understanding and management of self and how to interact with others through positive behavior, with school climate programs used to reinforce the classroom concepts school-wide.

  • Academic Performance
  • Alcohol
  • Anxiety
  • Bullying
  • Delinquency and Criminal Behavior
  • Depression
  • Emotional Regulation
  • Illicit Drug Use
  • Positive Social/Prosocial Behavior
  • Sexual Risk Behaviors
  • Tobacco
  • Truancy - School Attendance
  • Violence

    Program Type

    • Alcohol Prevention and Treatment
    • Drug Prevention/Treatment
    • School - Environmental Strategies
    • School - Individual Strategies
    • Skills Training
    • Social Emotional Learning

    Program Setting

    • School

    Continuum of Intervention

    • Universal Prevention (Entire Population)

      Population Demographics

      The program utilizes a PreK-12 curriculum, but the evaluations have been conducted primarily with children and youths in kindergarten through eighth grade. Blueprints-certified evaluations report results of the program through eighth grade for students in high-risk, urban schools. Although not certified by Blueprints, a curriculum for preschool children has been developed and evaluated in a pilot randomized trial.

      Age

      • Late Childhood (5-11) - K/Elementary
      • Early Adolescence (12-14) - Middle School

      Gender

      • Male and Female

      Race/Ethnicity

      • All Race/Ethnicity

      Race/Ethnicity/Gender Details

      There is some evidence in the Hawaii data that this program is more effective for boys than girls, but only in reducing violent behaviors and sexual activity at 5th grade, behaviors that girls of this age engage in very rarely. In the earlier waves of the Chicago study, tests for gender- or race-specific effects were not reported. By Wave 8 (eighth grade), tests for gender differences were reported, with results showing less of a decline in social-emotional and character development over time for boys compared to girls. The Chicago study was conducted in high-risk, urban schools with a large minority sample. Results replicated findings from the Hawaii trial, which was conducted in a mix of urban, suburban, rural and remote island schools.

      • Family
      • School
      • Peer
      • Individual
      Risk Factors
      • Individual: Antisocial/aggressive behavior, Bullies others, Early initiation of antisocial behavior, Early initiation of drug use, Favorable attitudes towards antisocial behavior*, Favorable attitudes towards drug use, Physical violence, Rebelliousness, Substance use, Victim of bullying
      • Peer: Interaction with antisocial peers, Peer substance use
      • School: Low school commitment and attachment, Poor academic performance, Repeated a grade
      Protective Factors
      • Individual: Academic self-efficacy*, Clear standards for behavior, Exercise, Perceived risk of drug use, Problem solving skills, Prosocial behavior, Prosocial involvement, Refusal skills, Rewards for prosocial involvement, Skills for social interaction*
      • Peer: Interaction with prosocial peers
      • Family: Attachment to parents, Opportunities for prosocial involvement with parents, Rewards for prosocial involvement with parents
      • School: Opportunities for prosocial involvement in education, Rewards for prosocial involvement in school

      *Risk/Protective Factor was significantly impacted by the program.

      See also: Positive Action Logic Model (PDF)

      Positive Action (PA) is a school-based program that includes school-wide climate change and a detailed curriculum with lessons 2-4 times a week (approximately 140 15-minute lessons per grade K-6 and 82 15-20 minute lessons per grade 7 and 8). Lessons for each grade level are scripted and age-appropriate. All materials necessary to teach the lesson are provided including posters, puppets, music, games, and other hands-on materials integrated into the lessons. Students’ materials include activity booklets, journals and other lesson aids. The content of the program is included in six units that form the foundation for the whole program. The first unit teaches the philosophy of the program and the Thoughts-Actions-Feelings about Self Circle, and provides an introduction to the nature and relevancy of positive and negative actions/behaviors. Units 2-6 teach the positive actions for the physical, intellectual, social and emotional areas. There are two school-wide climate development kits (elementary and secondary) and a Counselor’s Kit. The contents delivered through the climate development and counselor kits reinforce the classroom curriculum by coordinating the efforts of the entire school in the practice and reinforcement of positive actions.
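The lesson counts above imply a rough total of scripted classroom time per grade. The sketch below only multiplies the figures quoted in the text. The helper name `total_hours` is hypothetical, and the result assumes every lesson is delivered at its stated length, which the trials described later did not always achieve.

```python
def total_hours(lessons, minutes_per_lesson):
    """Convert a lesson count and per-lesson length (minutes) into total hours."""
    return lessons * minutes_per_lesson / 60

# Grades K-6: approximately 140 lessons of 15 minutes each
k6_hours = total_hours(140, 15)

# Grades 7 and 8: 82 lessons of 15 to 20 minutes each (lower and upper bound)
g78_hours_low = total_hours(82, 15)
g78_hours_high = total_hours(82, 20)

print(f"K-6: about {k6_hours:.1f} hours per grade")
print(f"Grades 7-8: about {g78_hours_low:.1f} to {g78_hours_high:.1f} hours per grade")
```

Under these assumptions the full curriculum amounts to about 35 hours per grade in K-6 and roughly 20 to 27 hours per grade in grades 7 and 8.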

      Positive Action (PA) is a school-based program that includes a detailed curriculum with lessons 2-4 times a week (approximately 140 15-minute lessons per grade K-6, and 82 15-20 minute lessons per grade 7 and 8). The content of the classroom curriculum is taught through six units, which teach the following:

      1. The Positive Action Philosophy and the Thoughts-Actions-Feelings about Self Circle - This unit provides the conceptual foundation for the content of the program delivered in Units 2-6 and teaches generally about positive and negative actions and their meaning for and application to life. The remaining units teach the specific positive actions for the whole self: the physical, intellectual, social and emotional.
      2. Positive Actions for Body and Mind - This unit focuses on nutrition, exercise, sleep, hygiene and other good health habits for the physical area, and thinking skills, problem solving, decision making, memorizing, reasoning, thinking creatively, curiosity, study skills and the value of learning for the intellectual area.
      3. Social/Emotional Positive Actions for Managing Yourself Responsibly - Students are taught to manage their personal resources like time, energy, thoughts, actions, feelings, money, talents and possessions, including basic self-control or self-regulation skills.
      4. Social/Emotional Positive Actions for Getting Along with Others - Students are taught to get along with others by treating them the way they would like to be treated, so they learn about respect, empathy, kindness, fairness, cooperation, and other ways they like to be treated.
      5. Social/Emotional Positive Actions for Being Honest with Yourself and Others - Students are taught to be honest with themselves and others by taking responsibility, learning how to be truthful, admitting to mistakes, not blaming others, knowing their own strengths and weaknesses, and following through with commitments.
      6. Social/Emotional Positive Actions for Improving Yourself Continuously - Students are taught how to set and achieve goals for all areas of themselves and learn how to reach goals by having the courage to try, turning problems into opportunities, believing in their potential, persisting and keeping an open mind in order to broaden their horizons.

      School-climate programs (elementary and secondary) are also utilized. They reinforce the classroom curriculum through coordinating the efforts of the entire school in the practice and reinforcement of positive actions. The school principal and a PA Committee administer this component with representatives from the faculty at each grade level, support staff, parents, students and community members. The principal is responsible for 1) initiating the adoption process, 2) appointing a PA coordinator and a PA committee, 3) coordinating training and professional development workshops and work groups, and 4) coordinating multiple resources. To encourage positivity throughout the school, principals are encouraged to use the provided materials, such as stickers, tokens, posters, music CDs, words of the week cards, certificates, balloons, and ICU Doing Something Positive boxes. For the secondary level there is a PALs Club with membership cards, a Peace Flag, Buzz Words and SOS Boxes. Principals are also provided with information on creating newsletters, and conducting assemblies and celebrations for Positive Action.

      PA also includes a Counselor's Kit which contains curriculum and materials that provide school counselors, social workers and school psychologists with the resources and information needed to do mentoring, peer tutoring, and support group programs, useful for students who may need more intense help than they are getting in the classroom. It contains a Topical Guide, which indicates which lessons and units to use for a specific subject of focus.

      Optional: Positive Action comes with optional supplements and kits that have not been certified by Blueprints.

      The Bullying, Fifth Grade Drug, Middle School Drug and Conflict Resolution Kits can be used with the regular PA curriculum or stand alone. The two or three lessons for each unit from these curricula can be added to the end of each unit to focus the unit topic on the subject of the kits; or the supplement kits can stand alone in their entirety.

      A family component provides parents with the opportunity to deliver a family curriculum. The Positive Action Family Kit contains 42 lessons, posters, music, games, activity sheets, Conflict Resolution Plans, Problem Solving and Decision Making Checklists, Words of the Week cards, and an ICU Doing Something Positive box and other materials for use at home. The Family Classes Instruction Kit provides seven two-hour lessons for parents, adolescents, and children to learn how to implement the Positive Action curriculum at home. There is also a Parent Classes Kit of seven one-hour classes. These components also encourage parents to become more involved with the school through participation on the PA Committee, attending PA assemblies and through volunteer work.

      Finally, a Community program is also available for use with coalitions and other community development groups. This program seeks to organize the community to do community-wide PA events and outlines projects to be done by sub-groups of the community, such as mental health, media, business, law enforcement and judicial. The Community/Coalition Kit contains a manual for the PA Community Committee to use to take the program community-wide. It also contains a Family Kit, a Counselor’s Kit, a Conflict Resolution Kit and a Media Kit.

      A program for preschool children and a stand-alone version of a family program have been evaluated in pilot randomized trials.

      The two randomized trials were implemented in K-5/6 schools in Hawaii and K-8 schools in Chicago. The program was implemented school-wide, utilized the school-wide climate change and counselor kits, and provided the curriculum to all grades in the trial schools and parent manuals to all parents. However, due to late start-up, holidays and test schedules, teachers delivered the curriculum for only 20-25 weeks per year. Teachers were allowed to combine or skip lessons (and were pointed to key lessons) in order to catch up. The teacher/school trainings generally consisted of one half day at the beginning of each year in Hawaii schools and a little less in Chicago schools.

      The program, grounded in the broader theory of self-concept, teaches youth that making positive and healthy behavioral choices results in feelings of self-worth. It is the whole behavior process that is needed to change behavior. PA brings to a conscious level the Thoughts-Actions-Feelings about Self Circle. It teaches that thoughts come before actions and that we need to be conscious of our thoughts because that is where the decision is made as to how we are going to act. Furthermore, we need to be careful of our actions because, once we have acted, we can’t take them back, and every action produces a reaction. Students are encouraged to tune into the feeling about themselves that they receive from an action, because that feeling will shape further thoughts. When thoughts, actions and feelings about ourselves are positive, we feel good about ourselves, and that determines our feelings of self-worth.

      Positive Action develops intrinsic motivation because our need to feel good about ourselves is a very powerful motivator, more so than extrinsic rewards, which have to be constantly increased and which stop sustaining the behavior once they cease. By explicitly linking thoughts, feelings, and actions, the program is believed to enhance the development and integration of affective and cognitive brain functions. Since problem behaviors are correlated and share several of the same predictors, this program applies a comprehensive approach to addressing the predictors of youth problem behaviors that includes self-concept development, school-wide environmental change, and parental and community involvement in an attempt to successfully affect multiple outcomes (e.g., academic performance, violence, and drug use). It is believed that the program itself will positively impact both children's knowledge and skills (character/self-concept, learning/study skills, self-management, interpersonal/social skills, self-honesty and responsibility, and goal setting/future orientation) and school and classroom outcomes (improved relationships amongst school administrators, teachers, students, and parents; improved classroom management; increased involvement of school with parents and community). One can expect changes in children's attitudes towards their behaviors, attachments, normative beliefs, academic and social skills, self-efficacy, and social and character development. Such changes should further lead to fewer disciplinary problems, improved school attendance and grades, and reduced emotional problems, violent behaviors and substance use.

      • Cognitive Behavioral
      • Person - Environment
      • Social Learning

      Two primary evaluations utilize clustered, randomized designs in which schools were matched and then randomized to a treatment or control condition. The first was conducted in Hawaii and the second was conducted in Chicago, Illinois. Both studies matched schools into pairs after screening them for eligibility. Matching criteria were similar across both studies, considering variables such as demographics, school size, mobility of students, ethnicity, student/teacher ratios, student characteristics such as special education and gifted status, disciplinary referrals, suspension rates, and standardized achievement scores. Schools in each pair were then randomly assigned to either the treatment or control condition. There were 10 matched pairs in Hawaii and 7 in Chicago. In the Hawaiian study, students in the first and second grade (two distinct cohorts) in the year prior to program implementation (2001-02) were followed through the end of the fifth and sixth grades (spring 2006). Program implementation began in the fall of 2002. In the Chicago study, students in 3rd grade during the 2004-2005 school year were followed through the spring of 2010 (end of grade 8). Analysis did not include students who moved out of study schools, but did include new, incoming students for each of the program years.
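The matched-pair assignment described above can be illustrated with a short sketch. It is not the evaluators' actual procedure: the function name `assign_matched_pairs`, the placeholder school labels, and the fixed seed are all hypothetical, and the matching step itself is assumed to have been completed already.

```python
import random

def assign_matched_pairs(pairs, seed=0):
    """Within each matched pair, randomly assign one school to treatment
    and the other to control (hypothetical helper for illustration)."""
    rng = random.Random(seed)  # fixed seed only so the sketch is reproducible
    assignment = {}
    for school_a, school_b in pairs:
        treated = rng.choice([school_a, school_b])
        control = school_b if treated == school_a else school_a
        assignment[treated] = "treatment"
        assignment[control] = "control"
    return assignment

# Placeholder labels for 10 matched pairs, the number reported for Hawaii
pairs = [(f"school_{i}a", f"school_{i}b") for i in range(1, 11)]
assignment = assign_matched_pairs(pairs)

for school_a, school_b in pairs:
    print(school_a, assignment[school_a], "|", school_b, assignment[school_b])
```

Because randomization happens within pairs, each condition receives exactly one school from every pair, which keeps the matched characteristics balanced across conditions by design.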

      Other studies used retrospective school-level archival data to employ a matched-control research design. A recent test of the preschool program randomly assigned students to classrooms in three Virginia preschools.

      In a study conducted in North Carolina (Guo et al., 2015), two rural, economically disadvantaged counties participated; all middle schools in the intervention county received the intervention for three years, and the schools in the other did not. Sample size ranged across waves from 3,715 to 5,894, but the matched samples used in the analysis ranged from 1,246 to 1,968. Primary outcome measures included self-esteem, school hassles, aggression, and internalizing symptoms. Drawing from the same data set, Smokowski et al. (2016) investigated whether the amount of exposure to the PA program (i.e., number of years participating in PA and number of PA lessons) was associated with adolescent functioning, including internalizing symptoms, aggression, school hassles and self-esteem. A quasi-experimental design was used in which 5,894 students who received different doses of PA were matched with students who received no PA.

      The randomized Hawaiian program evaluation revealed that, after three program years, Positive Action significantly reduced grade retention, suspensions, and absenteeism, increased reading and math scores on Hawaiian standardized tests, and improved student- and teacher-reports of school supportiveness across program schools, relative to control schools. Program youth reported significant improvements over time, relative to controls, in how they felt about themselves after engaging in positive behaviors. At the one-year post program follow-up, school-level data showed PA schools scoring significantly better than control schools on standardized tests of reading and math and had lower absenteeism and suspensions. At the end of fifth grade, PA students were significantly less likely to engage in substance use, violent behaviors, and sexual activity (self-report). Teachers also reported less violence, but teacher-observed substance use was not significant. One year post-trial, Positive Action schools reported significant improvements in school quality, compared to control schools.

      In Chicago, after three years (end of fifth grade), program youth made significant improvements, relative to controls, in honesty, self-control, problem solving, and affiliation with negative peers. Additionally, for program youth there was a 31% reduction in substance use behavior, 36% reduction in violence behavior, 41% reduction in bullying behaviors, and 27% (not significant) reduction in disruptive behaviors. The long-term follow-up to the end of eighth grade found significant program effects on substance use, normative beliefs supporting aggression, violence, bullying, emotional health, and health behaviors. For example, students in the Positive Action schools were 20-39% less likely to have ever used tobacco, alcohol, or marijuana than students in control schools. There were also mediating effects via social-emotional and character development (SECD) on substance use, emotional health, and health behaviors: the students in the intervention schools had a more gradual decline in SECD than students in control schools, and this slower rate of decline in SECD was related to better wave 8 scores on substance use, positive affect, life satisfaction, depression, anxiety, personal hygiene, and healthy food consumption and exercise.

      The preschool study found significant improvement in teacher ratings of the 11 assessed student behaviors and attitudes.

      In the North Carolina independent evaluation (Guo et al., 2015), students in the intervention group showed significant improvements in self-reported self-esteem and school hassles when compared to students in the control county. The program did have minor iatrogenic effects on student internalizing scores. Using the same data, Smokowski et al. (2016) found that students who received 3 years of the PA intervention and a high number of PA lessons had a significantly higher self-esteem score than those who received 0 years of PA or 0 lessons. However, higher dosage may have increased internalizing.

      Results of the randomized study in Hawaii revealed:

      • After three program years, there were school-wide reductions in grade retention, suspensions and absenteeism and school-wide improvements in reading and math proficiency and teacher- and student-reported school supportiveness for program schools, relative to control schools. These results were maintained through the one-year post implementation follow-up.
      • Significant improvements were found among Positive Action schools in school quality one-year post-trial, compared to control schools.
      • Fifth grade program youth were significantly less likely than controls to have engaged in self-reported substance use, violence, and sexual activity.

      Results of the randomized study in Chicago revealed that, compared to the control condition, the students and schools in the intervention condition showed significantly:

      • higher socio-emotional and character development at grades 5 and 8
      • lower self-reported substance use at grades 5 and 8
      • lower self-reported violence at grades 5 and 8, lower parent-reported bullying at grade 8, and lower self-reported bullying at grades 5 and 8
      • higher life satisfaction at grade 8
      • lower depression and anxiety at grade 8
      • lower unhealthy food consumption at grade 8
      • lower school-level disciplinary referrals and suspensions at grade 8
      • better reading test scores at grade 8.

      Compared to the control group in an independent evaluation (Guo et al., 2015), students in the intervention condition showed:

      • Improvements in self-esteem and school hassles scores
      • Possible iatrogenic effect on internalizing symptoms

      Significant Program Effects on Risk and Protective Factors:

      • Intervention students, compared to controls, had lower disaffection with learning and higher teacher-rated academic motivation at grade 8 (Chicago Study)
      • lower normative support for aggression at grade 8 (Chicago Study)
      • better social interaction skills, evidenced by higher scores on the Social Emotional and Character Development Scale (Chicago and Hawaii Studies)
      • higher self-esteem (Guo et al., 2015)

      Compared to matched control youth in Smokowski et al. (2016), the intervention group reported significantly higher self-esteem but only for those receiving 3 years of the intervention and a high number of PA lessons.

      The analysis of mediating effects of Social Emotional Character Development (SECD) on substance use showed that students with higher SECD at Wave 1 had significantly lower substance use at Wave 8. Further, there was a significant indirect effect mediated by SECD with no significant direct effect of Positive Action on substance use remaining, demonstrating complete mediation (see Study 2). Other mediation analyses showed significant indirect effects of the intervention on measures of emotional health and health behaviors via SECD.

      In the Hawaii study (Study 1) and Chicago study (Study 2), most effect sizes were medium to large.

      There is some evidence that this program is more effective for elementary school boys than girls, but only in reducing violent behaviors and sexual activity at grade 5, behaviors that girls at that age engage in very rarely. These effects are evident in results of analysis of the Hawaii data. In the Chicago study, by Wave 8 (eighth grade), tests for gender differences showed less of a decline in some aspects of social-emotional and character development over time for boys compared to girls. The Chicago study was conducted in high-risk, urban schools. However, the Chicago trial results replicated the results from the Hawaii trial, which was conducted in a mix of urban, suburban, rural and remote island schools. Both Guo et al. (2015) and Smokowski et al. (2016) used panel data collected as part of a 5-year project that involved rural, low-income counties in North Carolina.

      Hawaii study

      In the study of student outcomes (Beets et al., 2009), one-tailed tests of significance were used and Table 1 reports 90% confidence intervals, but the violence results would still have been significant at the .05 level with two-tailed tests. Because it was not reasonable to ask children in 1st and 2nd grade about violence, substance use, and sexual activity, the study did not have data on baseline equivalence. However, the prevalence of these behaviors would be near zero for both conditions in those grades, and additional results reported to Blueprints showed equivalence on a large number of other measures. For the same reason, it was not possible to examine differential attrition by baseline outcome variables, but additional results reported to Blueprints showed statistical equivalence across waves 1 and 5 in race and ethnicity.

      In the study of school quality outcomes (Snyder et al., 2012), the researchers were unable to compare results at the classroom or student level (because data were available from the school district only at the school level). However, random assignment of the schools should have provided for adequate equivalence of schools. Response rates for the parent reports were quite low (20.8% for both program and control schools), although response rates for teachers and students were very good and corroborate the results from the parent reports.

      Chicago study

      Li et al. (2011) found baseline differences across conditions for race but not age, gender, or problem behavior. The study also reported that data for covariates (though not outcomes) were imputed for about 42% of students (those who joined the study after the pretest). Analyses on academic outcomes (Bavarian et al., 2013) used one-tailed tests because of the small number of schools.

      There was a wide range in the level of fidelity to program implementation across program benchmarks (from moderate to high, with a range of between 22% and 100% of individual benchmarks being met), although generally schools were able to advance in readiness from year to year.

      Additional studies

      The Southeastern State Trial (included along with the Hawaii and Chicago trials in Washburn et al., 2011) was a third random assignment study, but it lacked pretest data and measured only one outcome, the frequency of positive "character" behaviors. Nevertheless, results were consistent with results from the Hawaii and Chicago studies, and thus demonstrate generalizability to small rural schools.

      Two earlier quasi-experimental studies (Flay, Allred, & Ordway, 2001; Flay & Allred, 2003) used matched control schools that were selected retrospectively, after the program was implemented, based upon having readily available outcome data. The nonrandom assignment to the program and control conditions raises questions of the internal validity relating to the extent to which the program schools were different from the comparison schools prior to intervention. Though the schools in the different conditions were relatively well matched on three demographic variables (as well as baseline standardized test scores in Flay & Allred 2003), it is plausible that the selection process placed schools with different leadership, faculty cultures, organizational histories and resources, etc. in the program and comparison conditions. School disciplinary data are limited by several sources of unreliability that include variability in reporting, errors in reporting, and lower levels due to disincentives for teachers or principals to report disciplinary incidents. Nevertheless, the validity of the findings was supported by their subsequent replication in the two randomized trials.

      Guo et al., 2015:

      • nonrandom assignment, though with propensity score matching
      • sample sizes across waves described only in general terms
      • did not adjust for school level clustering, though clustering was small
      • groups differed on many baseline measures before matching, but not after matching
      • no information on attrition
      • possible iatrogenic effect (internalizing symptoms)

      Dosage Study (Smokowski et al., 2016): uses the same data as Guo et al. (2015), so the same limitations apply. In addition:

      • Dosage groups differed significantly at baseline on many measures
      • Across 12 tests of dosage measured in years and 16 tests of dosage measured in number of lessons, only two significant effects emerged
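To put the last limitation in context, the sketch below estimates how many significant results chance alone could produce across the 28 dosage tests. Two assumptions are not stated in the source and are made only for illustration: a conventional significance level of .05, and statistical independence among the tests (which is unlikely to hold exactly, since the outcomes are related).

```python
from math import comb

n_tests = 12 + 16   # tests of dosage in years plus tests of dosage in lessons
alpha = 0.05        # ASSUMPTION: conventional threshold, not given in the source

# Expected number of false positives if no true dosage effects existed
expected_false_positives = n_tests * alpha

def prob_at_least(k, n, p):
    """Binomial probability of k or more 'significant' results out of n tests."""
    return 1 - sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k))

# ASSUMPTION: tests treated as independent
p_two_or_more = prob_at_least(2, n_tests, alpha)

print(f"Expected by chance: {expected_false_positives:.1f} of {n_tests} tests")
print(f"Probability of 2 or more by chance: {p_two_or_more:.2f}")
```

Under these assumptions, about 1.4 of the 28 tests would be expected to reach significance by chance, so two significant effects provide only limited evidence of a dosage relationship.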

      Preschool Study

      • Outcome measures came from ratings of instructors who also delivered the program rather than independent observers.
      • The intervention students had higher mean scores at baseline on nearly all outcomes.
      • No analysis of differential attrition, but attrition was only 8%.
      • Narrow sample from three preschools in one state.
      • No long-term results.

      • Blueprints: Model
      • Crime Solutions: Effective
      • OJJDP Model Programs: Effective
      • SAMHSA: 2.2-2.8
      • What Works Clearinghouse: Meets Standards Without Reservations - Positive Effect

      Southbridge Public Schools
      25 Cole Ave
      Southbridge, MA 0155
      Contact: Nikki Murphy, SEL Director
      508-764-5415
      murphyn@southbridge.k12.ma.us

      Bavarian, N., Lewis, K. M., Acock, A., DuBois, D. L., Zi, Y., Vuchinich, S., ... Flay, B. R. (2016). Direct and mediated effects of a social-emotional learning and health promotion program on adolescent health outcomes: A matched-pair, cluster-randomized controlled trial. Journal of Primary Prevention, 37, 87-105.

      Bavarian, N., Lewis, K. M., DuBois, D. L., Acock, A., Vuchinich, S., Silverthorn, N., ... Flay, B. R. (2013). Using social-emotional and character development to improve academic outcomes: A matched-pair, cluster-randomized controlled trial in low-income, urban schools. Journal of School Health, 83(11), 771-779.

      Beets, M. W., Flay, B. R., Vuchinich, S., Snyder, F., Acock, A., Burns, K., ... Durlak, J. (2009). Use of a social and character development program to prevent substance use, violent behaviors, and sexual activity among elementary-school students in Hawaii. American Journal of Public Health, 99(8), 1-8.

      Flay, B. R. (2012). Randomized evaluation of the Positive Action pre-K program. Unpublished.

      Flay, B. R. & Allred, C. G. (2003). Long-term effects of the Positive Action program. American Journal of Health Behavior, 27(Supplement 1), 6-21.

      Flay, B. R., Allred, C. G., & Ordway, N. (2001). Effects of the Positive Action program on achievement and discipline: Two matched-control comparisons. Prevention Science, 2(2), 71-89.

      Guo, S., Wu, Q., Smokowski, P. R., Bacallao, M., Evans, C. B. R., & Cotter, K. L. (2015). A longitudinal evaluation of the Positive Action program in a low-income, racially diverse, rural county: Effects on self-esteem, school hassles, aggression, and internalizing symptoms. Journal of Youth and Adolescence, 44, 2337-2358.

      Lewis, K. M., Bavarian, N., Snyder, F. J., Acock, A., Day, J., DuBois, D. L., ... Flay, B. R. (2012). Direct and mediated effects of a social-emotional and character development program on adolescent substance use. The International Journal of Emotional Education, 4(1), 56-78.

      Lewis, K. M., DuBois, D. L., Bavarian, N., Acock, A., Silverthorn, N., Day, J., ... Flay, B. R. (2013). Effects of Positive Action on the emotional health of urban youth: A cluster-randomized trial. Journal of Adolescent Health, 53, 706-711.

      Lewis, K. M., Schure, M. B., Bavarian, N., DuBois, D. L., Day, J., Ji, P., ... Flay, B. R. (2013). Problem behavior and urban, low-income youth: A randomized controlled trial of Positive Action in Chicago. American Journal of Preventive Medicine, 44(6), 622-630.

      Li, K. K., Washburn, I., DuBois, D. L., Vuchinich, S., Ji, P., Brechling, V., ... Flay, B. R. (2011). Effects of the Positive Action program on problem behaviors in elementary school students: A matched-pair randomized control trial in Chicago. Psychology & Health, 26, 187-204.

      Smokowski, P. R., Guo, S., Wu, Q., Evans, C. B. R., Cotter, K. L., & Bacallao, M. (2016). Evaluating dosage effects for the Positive Action program: How implementation impacts internalizing symptoms, aggression, school hassles, and self-esteem. American Journal of Orthopsychiatry. Advance online publication, http://dx.doi.org/10.1037/ort0000167.

      Snyder, F., Vuchinich, S., Acock, A., Washburn, I., Beets, M., & Li, K. K. (2010). Impact of the Positive Action program on school-level indicators of academic achievement, absenteeism, and disciplinary outcomes: A matched-pair, cluster randomized, controlled trial. Journal of Research on Educational Effectiveness, 3(1), 26-55.

      Snyder, F. J., Vuchinich, S., Acock, A., Washburn, I. J., & Flay, B. R. (2012). Improving elementary school quality through the use of a social-emotional and character development program: A matched-pair, cluster-randomized control trial in Hawai'i. Journal of School Health, 82, 11-20.

      Washburn, I. J., Acock, A., Vuchinich, S., Snyder, F., Li, K. K., Ji, P., ... Flay, B. R. (2011). Effects of a social-emotional and character development program on the trajectory of behaviors associated with social-emotional and character development: Findings from three randomized trials. Prevention Science, 12(3), 314-323.

      Carol Gerber Allred
      264 4th Avenue
      Twin Falls, Idaho 83303-2347
      (800) 345-2974
      carol@positiveaction.net
      www.positiveaction.net

      Study 1

      Beets, M. W., Flay, B. R., Vuchinich, S., Snyder, F., Acock, A., Burns, K., ... Durlak, J. (2009). Use of a social and character development program to prevent substance use, violent behaviors, and sexual activity among elementary-school students in Hawaii. American Journal of Public Health, 99(8), 1-8.

      Snyder, F., Vuchinich, S., Acock, A., Washburn, I., Beets, M., & Li, K. K. (2010). Impact of the Positive Action program on school-level indicators of academic achievement, absenteeism, and disciplinary outcomes: A matched-pair, cluster randomized, controlled trial. Journal of Research on Educational Effectiveness, 3(1), 26-55.

      Snyder, F. J., Vuchinich, S., Acock, A., Washburn, I. J., & Flay, B. R. (2012). Improving elementary school quality through the use of a social-emotional and character development program: A matched-pair, cluster-randomized control trial in Hawai'i. Journal of School Health, 82, 11-20.

      Washburn, I. J., Acock, A., Vuchinich, S., Snyder, F., Li, K. K., Ji, P., ... Flay, B. R. (2011). Effects of a social-emotional and character development program on the trajectory of behaviors associated with social-emotional and character development: Findings from three randomized trials. Prevention Science, 12(3), 314-323.

      Study 2

      Bavarian, N., Lewis, K. M., Acock, A., DuBois, D. L., Zi, Y., Vuchinich, S., ... Flay, B. R. (2016). Direct and mediated effects of a social-emotional learning and health promotion program on adolescent health outcomes: A matched-pair, cluster-randomized controlled trial. Journal of Primary Prevention, 37, 87-105.

      Bavarian, N., Lewis, K. M., DuBois, D. L., Acock, A., Vuchinich, S., Silverthorn, N., ... Flay, B. R. (2013). Using social-emotional and character development to improve academic outcomes: A matched-pair, cluster-randomized controlled trial in low-income, urban schools. Journal of School Health, 83(11), 771-779.

      Lewis, K. M., Bavarian, N., Snyder, F. J., Acock, A., Day, J., DuBois, D. L., ... Flay, B. R. (2012). Direct and mediated effects of a social-emotional and character development program on adolescent substance use. The International Journal of Emotional Education, 4(1), 56-78.

      Lewis, K. M., DuBois, D. L., Bavarian, N., Acock, A., Silverthorn, N., Day, J., ... Flay, B. R. (2013). Effects of Positive Action on the emotional health of urban youth: A cluster-randomized trial. Journal of Adolescent Health, 53, 706-711.

      Lewis, K. M., Schure, M. B., Bavarian, N., DuBois, D. L., Day, J., Ji, P., ... Flay, B. R. (2013). Problem behavior and urban, low-income youth: A randomized controlled trial of Positive Action in Chicago. American Journal of Preventive Medicine, 44(6), 622-630.

      Li, K. K., Washburn, I., DuBois, D. L., Vuchinich, S., Ji, P., Brechling, V., ... Flay, B. R. (2011). Effects of the Positive Action program on problem behaviors in elementary school students: A matched-pair randomized control trial in Chicago. Psychology & Health, 26, 187-204.

      Washburn, I. J., Acock, A., Vuchinich, S., Snyder, F., Li, K. K., Ji, P., ... Flay, B. R. (2011). Effects of a social-emotional and character development program on the trajectory of behaviors associated with social-emotional and character development: Findings from three randomized trials. Prevention Science, 12(3), 314-323.

      Hawaii Study (Beets et al., 2009; Washburn et al., 2011; Snyder et al., 2010, 2012)

      Evaluation Methodology

      Design: This evaluation was conducted in Hawaii, using a cluster-randomized design in one large school district. First- and second-grade students were followed from spring 2002 through the wave 5 follow-up in spring 2006 (grades 5 and 6). Elementary schools within that district, on the islands of Oahu, Maui, and Molokai, were eligible to participate in the study if they were public K-5 or K-6 schools and not academy, charter, or special education schools. Schools were also required to have at least 25% of their student population eligible for free or reduced-price lunches, to be in the lower three quartiles in standardized test scores, and to have annual student stability rates over 80%. One hundred eleven of 151 elementary schools on these islands were eligible (73.5%).

      Schools were stratified by risk score, calculated considering 1) demographics such as school size, mobility of students, ethnicity, and student/teacher ratios, 2) student characteristics such as special education and gifted status, and 3) student behavior and performance such as disciplinary referrals, suspension rates, and standardized achievement scores. This produced 19 usable strata, each containing 3 to 6 schools. Prior to recruitment, schools within these strata were randomly assigned to either the treatment or control condition. If a school declined to participate in its assigned condition, researchers attempted to replace the declining school with one of the remaining schools in the same stratum. However, every time a school declined, the remaining schools in the stratum also declined. Eight strata were abandoned on Oahu because of such issues. This resulted in 10 matched pairs of schools, 5 on Oahu, 3 on Maui, and 2 on Molokai. Compared to schools in abandoned strata and to those that declined, participating schools were not significantly different on variables used to determine risk scores. Furthermore, the treatment and control schools were statistically similar to one another overall and did not significantly differ from the averages of the discarded sets. As planned, trial schools were of higher risk than the average of all Hawaiian public schools.

      Intervention schools were offered the complete PA program free of charge and control schools were offered a monetary incentive during the randomized trial and the PA program upon completion of the trial. Three of the 10 control schools chose to receive the PA program after the formal trial; they were treated as controls at the follow-up to the present study, as anecdotal evidence suggests that they did not fully implement the program, and it is likely that schools need several years to fully implement a comprehensive program to see substantial benefits.

      The study followed students who were in first or second grade at baseline (the 2001-2002 academic year) and who stayed in the study schools through fifth or sixth grade (the 2005-2006 academic year). Students who left the study schools during the study period were dropped, and students who joined study schools during the study period were added to the study. Thus, the study included students who entered the schools at any year during the course of the study and who were in fifth grade at the end of the study.

      Student self-reports of their behavior were collected at five time points, on each of two cohorts (first graders and second graders at the start of the project). Data were collected for baseline at the end of the academic school year in half of both the control and PA schools and at the beginning of the next school year in the others. The remaining four waves of data were collected at the next four springs. Researchers rather than school staff collected the data.

      Because some children changed schools, were sometimes absent for an administration of the questionnaire, or refused to answer selected items, there were missing data at all waves. For student reports of their own behavior, 1,544 students responded at the first wave, 2,116 at the second wave, 1,498 at the third wave, 1,493 at the fourth wave and 696 at the final wave. The sharp drop at the final wave was because 6 of the 20 schools (3 control and 3 PA) did not contain sixth grade and the entire older cohort in those schools was lost to follow-up. There were a total of 7,347 observations from 2,646 children distributed over 20 schools, with an average of 2.8 waves of data for each student.
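
      The totals in this paragraph can be checked with a few lines of arithmetic. The sketch below is only a consistency check on the reported figures; the variable names are illustrative and not taken from the study.

```python
# Student self-report respondents at each of the five waves, as reported above.
wave_counts = [1544, 2116, 1498, 1493, 696]
n_students = 2646  # distinct children contributing at least one wave

total_observations = sum(wave_counts)
mean_waves_per_student = total_observations / n_students

print(total_observations)                # 7347
print(round(mean_waves_per_student, 1))  # 2.8
```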

      Beets et al. (2009) reported results from fifth grade students (aged 10-11 years). They were asked to obtain active parental consent and to provide verbal assent to respond to items asking about substance use, violent behavior, and sexual activity. This request garnered responses from 976 intervention students (50% girls) and 738 control students (50% girls), a response rate of 86%.

      Sample Characteristics: The sample was evenly split between boys (50.2%) and girls (49.8%). Program schools were 32.7% Hawaiian, 20.5% White, 1.4% Black, 0.2% Native American, 2.7% Pacific Islander, 4.3% Japanese, 3.7% other Asian, 12.4% other ethnicity, and 22.0% multi-ethnic. Control schools were 26.5% Hawaiian, 11.4% White, 2.0% Black, 0.3% Native American, 3.2% Pacific Islander, 11.8% Japanese, 5.1% other Asian, 14.4% other ethnicity, and 25.3% multi-ethnic. Eighty-eight percent of the population of both program and control schools were said to be stable in terms of mobility. Per capita income was about $14,000 for both program and control schools. Sixty-two percent of the population of the program schools were eligible for free or reduced price lunches, along with 56% of the population of control schools. About 11% of both groups were special education students and about 9% were non-English speakers.

      Measures: School-level data on daily absences, suspensions, retention in grade, and academic achievement indicators including grade 5 reading and math and grade 4 reading and math were available, as well as student self-report for attitudes toward positive behaviors and lifetime substance use and violence behaviors. Teachers also reported in years 4 and 5 on child behavior, including violence. In order to measure school quality change (Snyder et al., 2012), archival school-level data were obtained from the Hawaii Department of Education Accountability Resource Center Hawaii as part of the state's school quality survey (SQS) accountability system. These data were collected from teachers, parents, and students every 2 years. There were 9 indicators of school quality: 1) safety and well-being, 2) involvement of parents, 3) satisfaction of parents, students, and teachers, 4) quality of student support, 5) focused and sustained action, 6) responsiveness of the system, 7) standards-based learning, 8) professionalism and capacity of the system, and 9) coordinated team work.

      Analysis: Analyses vary according to the specific report examined, but include growth curve analysis conducted on the first 4 waves of data, two-level random effects models with students nested within schools, and Poisson models. For the school quality analysis, school quality composite scores (SQC) were created for teachers, parents, and students by calculating the average of all SQS indicators for each respondent group. Primary analysis included matched-paired t-tests, and sensitivity analysis was conducted using permutation tests with Stata v11 permute. Effect sizes were calculated using Hedges' adjusted g. A percent relative improvement (RI) was calculated to better interpret the effect size indicators.
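
      As a rough illustration of the school-level analysis described above, the sketch below computes a matched-pair t statistic, a permutation p-value, and Hedges' adjusted g. It is not the authors' code. The scores are hypothetical, the function names are invented for this example, the permutation test enumerates all sign flips exactly (Stata's permute command samples permutations instead), and the small-sample correction shown is one common form of Hedges' g that may differ from the variant used in the papers.

```python
from itertools import product
from math import sqrt
from statistics import mean, stdev


def paired_t(diffs):
    """Matched-pair t statistic: mean pair difference over its standard error."""
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


def sign_flip_p(diffs):
    """Exact two-sided permutation p-value for matched pairs.

    Under the null hypothesis each pair difference is equally likely to be
    positive or negative, so every pattern of sign flips is enumerated.
    """
    observed = abs(mean(diffs))
    hits = total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        permuted = abs(mean([s * d for s, d in zip(signs, diffs)]))
        if permuted >= observed - 1e-12:  # tolerance for float rounding
            hits += 1
    return hits / total


def hedges_g(treat, ctrl):
    """Standardized mean difference with a small-sample bias correction."""
    n1, n2 = len(treat), len(ctrl)
    pooled_sd = sqrt(
        ((n1 - 1) * stdev(treat) ** 2 + (n2 - 1) * stdev(ctrl) ** 2) / (n1 + n2 - 2)
    )
    d = (mean(treat) - mean(ctrl)) / pooled_sd
    return d * (1 - 3 / (4 * (n1 + n2) - 9))


# Hypothetical school-level scores for 10 matched pairs (NOT study data).
pa_scores = [52, 48, 55, 60, 47, 58, 51, 49, 56, 53]
control_scores = [47, 45, 50, 54, 48, 51, 46, 47, 50, 49]
diffs = [p - c for p, c in zip(pa_scores, control_scores)]

print(paired_t(diffs), sign_flip_p(diffs), hedges_g(pa_scores, control_scores))
```

      With only 10 pairs there are 1,024 possible sign patterns, so exact enumeration is cheap and avoids Monte Carlo error.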

      Outcomes

      Implementation Fidelity: On average, 85% of teachers completed implementation reports at the end of each unit. In the first year, 59% of the teachers reported completion of the expected 4 or more lessons of Positive Action each week and this increased to 71% by the third year. Implementation fidelity, overall, varied widely between schools, especially in the first year. By the third year, five schools were implementing at a high level of fidelity (though still not at full fidelity, meaning that very few schools achieved much with regard to the family- or community-involvement components). Three were implementing at a moderate-to-high level and the last two schools were still implementing at low levels of fidelity.

      Baseline Equivalence: The studies reported no significant baseline differences between conditions for the following list of variables.

      • Beets et al. (2009): the percentage of students receiving free or reduced-price lunch, school size, percentage of student stability, and student ethnic distribution; additional characteristics of the school (student-teacher ratio and expenditures per student); characteristics of student populations (proportions of gifted, special education, and English as a second language students); and indicators of student behavioral and school performance outcomes (disciplinary referrals, suspension rates, and standardized achievement scores).
      • Beets et al. (2009): self-reported and teacher-reported negative student behaviors such as gets into fights, threatens others, physically hurts others, and hits others (using random-effects models with students nested within schools).
      • Washburn et al. (2011): endorsement of positive behaviors (using a simple t-test and a simple linear growth model).
      • Snyder et al. (2010): 19 sociodemographic variables and 7 academic outcome variables.
      • Snyder et al. (2012): a school quality composite score from teachers, parents, and students.

      In addition, a February 2012 response from Dr. Flay to questions from the Blueprints Board reported on baseline differences on 43 attitudinal variables, with only one significant effect emerging.

      Differential Attrition: The analysis of school-level outcomes in Snyder et al. (2010, 2012) had no attrition. The other two papers did not present a full analysis of differential attrition but offered relevant evidence.

      Beets et al. (2009) compared the negative behavior scale for control group students who began at baseline and participated in all five years of the study with those who entered the study after baseline. The results indicated no significant difference.

      Washburn et al. (2011) noted that there were missing data at each wave and that the loss of data at the last wave came largely from six schools without a sixth grade and absences of students on the day of the assessments. They also stated, “Given that parents, not students, usually decide if a student is in a school or not and, therefore, missingness is not related to the student behavior outcome (in only two cases was missingness significantly correlated with the outcome and in both cases the correlation was small, only -0.22 and -0.10) ...”

      Individual-Level Fifth Grade Outcomes (Beets et al., 2009): Individual-level analysis at the end of fifth grade using two-level models for students, but without baseline controls, showed that, relative to the control group, the intervention group had significantly lower substance use, violent behavior, and sexual activity according to self-reports, and lower violence according to teacher reports. Relative risk and odds ratios ranged from .54 for teacher-reported violence to .24 for self-reported sexual activity.

      School-level analysis of mean differences at the end of fifth grade showed significantly lower means for intervention schools on self-reported substance use and violent behavior and teacher-reported violent behavior (self-reported sexual activity and teacher-reported substance use were marginally significant).

      A dose-response analysis of students showed that program effects were larger for those students exposed to the program 3-4 years compared with those exposed for only 1-2 years.

      Individual-Level Growth Models in Endorsement of Positive Behaviors (Washburn et al., 2011): Three-level growth models of a scale of positive behaviors revealed a significant interaction between year and condition and between year squared and condition. The interactions showed that the number of positive behaviors endorsed decreased from year to year and this decrease was significantly lower in the intervention group. By the fifth wave, the sample means were 50.88 and 37.23 for the intervention and control schools, respectively. Cohen’s d for the final wave (controlling for baseline differences) was 0.46.

      School Academic Outcomes (Snyder et al., 2010): At posttest, the mean comparisons indicated that intervention schools had significantly higher math and reading scores, significantly lower absenteeism, and marginally fewer suspensions. After completion of the randomized trial, at one-year post trial as intervention schools continued to implement the program, reading TerraNova and math and reading scores were significantly higher among intervention schools; and absenteeism and suspensions were significantly lower for intervention schools. All of the effect sizes were moderate to large, regardless of the level of significance.

      Random intercept growth models largely confirmed the results for math scores, reading scores, and absenteeism, while they showed some significant intervention effects on suspensions and retention that were absent in the mean comparisons.

      School Quality Results (Snyder et al., 2012): By 1-year post-trial, school-quality composite (SQC) scores among all respondent groups participating in Positive Action increased significantly compared to scores among the control group schools (which actually exhibited decreases). In fact, SQC scores in the PA schools exceeded even the statewide averages. Effect sizes were greatest among teachers (1.61) and also large among parent (1.26) and student (1.31) reports. Relative improvements on SQC scores among the intervention schools ranged from 16.2% to 21.1% across the three respondent groups. When analyzed individually, the majority of the 9 indicators of school quality were found to have improved across all respondent groups when compared to the reports of the control school participants. Almost all effect sizes were moderate to large. Time x condition interactions for the teacher, parent, and student models were all statistically significant, further supporting the finding that school quality improved significantly in intervention schools as a result of Positive Action implementation.

      Chicago Study (Li et al., 2011; Lewis et al., 2012; Lewis, DuBois et al., 2013; Lewis, Schure et al., 2013; Bavarian et al., 2013; Washburn et al., 2011; Bavarian et al., 2016)

      Evaluation Methodology

      Design: This evaluation was conducted in the Chicago Public Schools system beginning in the fall of 2004. Four hundred eighty-three K-6 and K-8 schools were screened for eligibility on a number of criteria. Schools were excluded if they were not community schools (that is, if they were academy, charter, or special education schools), if they were already using Positive Action or a similar program, if their enrollment was below 50 or above 140 students per grade, if their annual student mobility rates were 40% or above, if more than 50% of students passed the Illinois Standards Achievement Test (ISAT), or if less than 50% of their students received free lunches. According to these criteria, 68 schools were invited to attend information sessions about the PA program and research study. Representatives from 36 of these schools attended one of the information sessions. Eighteen of the schools agreed to participate with the understanding that they would be matched with another suitable school in the pool and randomly assigned to conditions. Using a SAS program, the schools were matched into pairs on a range of variables, including ethnicity (percentage White, Black, Hispanic, and Asian), ISAT test scores, attendance rate, truancy rate, socioeconomic variables, percentage of students who enrolled or left school during the school year, number of students per grade, percentage of parents reported to demonstrate school involvement, and percentage of teachers employed by the school who met minimal teaching standards. Using this process, the best seven of nine well-matched pairs of schools were selected, and thus four schools were dropped, leaving 14 schools. These 14 schools were not significantly different from the 68 eligible schools on any of these measures, nor were treatment and control schools different. The study followed a single cohort of students who were in the third grade when the program implementation began. New students joining this cohort in subsequent years were also included and followed.
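
      The screening rules in the design description above can be summarized as a single eligibility check. This is a minimal sketch, assuming each school is described by the six fields below; the class, field names, and example values are hypothetical and are not drawn from the study's data files.

```python
from dataclasses import dataclass


@dataclass
class School:
    is_community_school: bool   # False for academy, charter, or special education schools
    uses_pa_or_similar: bool    # already using Positive Action or a similar program
    students_per_grade: float
    mobility_rate: float        # annual student mobility, in percent
    isat_pass_rate: float       # percent of students passing the ISAT
    free_lunch_rate: float      # percent of students receiving free lunches


def is_eligible(s: School) -> bool:
    """Return True when none of the stated exclusion criteria apply."""
    return (
        s.is_community_school
        and not s.uses_pa_or_similar
        and 50 <= s.students_per_grade <= 140   # excluded if below 50 or above 140
        and s.mobility_rate < 40                # excluded at 40% or above
        and s.isat_pass_rate <= 50              # excluded if more than 50% passed
        and s.free_lunch_rate >= 50             # excluded if less than 50% received free lunch
    )


# Hypothetical examples (not real schools).
print(is_eligible(School(True, False, 90, 25, 34, 89)))   # True
print(is_eligible(School(True, False, 90, 40, 34, 89)))   # False: mobility at 40%
```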

      Each member of the seven pairs was randomly assigned to either the treatment or a waitlist-control condition (which would receive the program after three years). The treatment condition schools began program implementation in the 2004-2005 school year (and the waitlisted control schools were to begin implementation in the 2007-2008 school year, but were then asked to hold off for another 3 years so that the study could continue through grade 8). During Phase I of the funding, five assessments were conducted at baseline (Fall 2004), Spring 2005, Fall 2005, Spring 2006 and Spring 2007 (at the end of grade five). Every student who was present in the study schools at each wave of data collection was assessed but students who moved out of the study schools were not tracked. As such, the sample is slightly different at each assessment point. At the end of the three-year study period, approximately 510 fifth graders completed the questionnaires; slightly more than half of these students (290, 56.9%) were part of the original sample of 590 students at baseline. Li et al. (2011) found no significant difference on baseline problem behaviors between stayers and dropouts across the multiple imputations. Also, the researchers stated via email communication that there was no differential dropout or addition by group condition.

      Sample Characteristics: Students in program schools were, on average, 10.23% White, 52.48% Black, 32.24% Hispanic, and 7.18% Asian American. Thirty-four percent met minimal state achievement test criteria and 89% qualified for free lunch. Students in control schools were 11.73% White, 55.35% Black, 28.62% Hispanic, and 4.14% Asian American. Thirty-four percent met minimal criteria on state achievement tests and 91% were eligible for free lunches. 

      The sample of fifth graders with data at baseline (n=290) were primarily African American (46.4%); the remaining groups were 27% Hispanic, 6.9% White non-Hispanic, 2.8% Asian, and 17% other or mixed. There were 49% girls in the control group and 51% in the intervention group. 

      Measures: Across the eight waves, outcome measures covered behaviors, values, attitudes, academics, and school discipline. Unless otherwise noted, the measures come from youth self-reports.

      Lifetime prevalence of substance use (Li et al., 2011; Lewis et al., 2012). Items asked if subjects had ever (1) smoked a cigarette, (2) drank alcohol, (3) gotten drunk on alcohol, (4) used marijuana, or (5) used other more serious drugs. A count variable measured the number of substances ever used. Alphas ranged from .71 to .79 across the waves.

      Lifetime prevalence of serious violence-related behavior (Li et al., 2011; Lewis et al., 2013). Starting at the end of 5th grade, items asked if subjects had ever (1) carried a knife, (2) threatened to cut or stab someone, (3) cut or stabbed someone on purpose, (4) been asked to join a gang, (5) hung out with gang members, or (6) been a member of a gang. A count variable measured the number of behaviors the subject had ever engaged in. Alphas ranged from .74 to .82 across the waves.

      Bullying (Li et al., 2011; Lewis et al., 2013). Bullying behaviors were measured using six items selected from the Aggression Scale. Children responded to how often in the past 2 weeks they had engaged in verbal or physical aggression at school (e.g., teased others, shoved others). Responses ranged from 0 (never) to 3 (many times) and, after dichotomizing into ever and never, were summed to create a count variable. Alphas ranged from .83 to .90 across the waves.

      Disruptive behaviors (Li et al., 2011; Lewis et al., 2013). Children were asked to respond to six items about how often in the past couple of weeks they had engaged in different problem behaviors at school (e.g., taking something at school that belonged to others, skipping class). Responses ranged from 0 (never) to 3 (many times) and, after dichotomizing into ever and never, were summed to create a count variable. Alphas ranged from .77 to .81 across the waves.
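
      The count-variable scoring described for the bullying and disruptive-behavior scales can be sketched as follows. The function name and the example responses are hypothetical; only the 0-to-3 response range and the ever/never dichotomization come from the descriptions above.

```python
def ever_count(responses):
    """Dichotomize 0-3 frequency responses into never (0) vs. ever (1+),
    then sum across items to get the count of behaviors engaged in."""
    for r in responses:
        if r not in (0, 1, 2, 3):
            raise ValueError(f"response out of range: {r!r}")
    return sum(1 for r in responses if r > 0)


# Hypothetical six-item response pattern: three behaviors reported at least once.
print(ever_count([0, 1, 3, 0, 2, 0]))  # 3
```

      Note that dichotomizing first means a child who reports one behavior "many times" scores the same as a child who reports it once; the count reflects breadth of behaviors, not frequency.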

      Social-Emotional and Character Development (SECD) (Lewis et al., 2012). The 28-item Child SECD Scale was adapted from multiple existing measures of social skills. An average composite score of the 28 items was created for each of the eight waves, where higher scores indicate higher SECD skills. Example items are: "I try to cheer up other kids if they are feeling sad", "I apologize when I have done something wrong", “I speak politely to my teacher”, “I keep my temper when I have an argument with other kids”, “I listen (without interrupting) to my parents”, and "I follow school rules". Responses to these items on a 4-point scale allowed students to indicate how often they performed each SECD-related behavior (1= none of the time; 2= some of the time; 3= most of the time; and 4= all of the time). Alphas ranged from .88 to .92 across the waves.

      Normative beliefs supporting aggression (Lewis et al., 2013). Students answered questions adapted from the Normative Beliefs About Aggression Scale, which has established reliability and validity for school-aged children. Eight items (e.g., Is it ok or wrong to hit, shove, yell, fight other people?) were rated on a 4-point scale (really wrong to perfectly ok) and averaged to create a composite score, with higher scores reflecting the belief that aggression is more acceptable (alpha range .81–.93). Given a skewed distribution of responses, the scale score was split for analysis using a median split across all waves.

      Parent-reported bullying (Lewis et al., 2013). Parents responded to six items (alpha range .73–.83) regarding bullying (e.g., hits others, teases, threatens to hurt others) in the past 30 days. The items used a 4-point scale (never to almost always) but were dichotomized and converted to a count. The outcome was assessed at Waves 1–5 and Wave 8.

      Parent-reported conduct problems (Lewis et al., 2013). Parents responded to seven items (alpha range .74–.81) regarding conduct problems (e.g., truancy, cheating, stealing) in the past 30 days. The items used a 4-point scale (never to almost always) but were dichotomized and converted to a count. These outcomes were assessed at Waves 1–5 and Wave 8.

      Disaffection with learning (Bavarian et al., 2013). Four items from a measure of student engagement used a four-point Likert scale (“Disagree A LOT” to “Agree A LOT”) and statements such as “When I’m in class, I think about other things” and “When I’m in class, my mind wanders”. A composite with high scores reflecting more disaffection had alphas ranging from .64 to .71 across the eight waves.

      Academic grades (Bavarian et al., 2013). Students were asked, “What grades have you been getting this school year?” Response options ranged from 1 to 9 (e.g., 1 = Mostly Fs, 4 = mix of Cs and Ds, and 9 = Mostly As).

      Teacher-assessed academic ability (Bavarian et al., 2013). Teachers rated students on reading, mathematics, academic performance, and intellectual functioning using a 5-point Likert scale (1 = Far below grade level to 5 = Far above grade level). A composite score indicating higher ratings of students’ academic ability had alphas ranging from .97 to .98.

      Teacher-assessed academic motivation (Bavarian et al., 2013). A single-item measure used response options ranging from “Extremely low” to “Extremely high”.

      School-level disciplinary referrals (Lewis et al., 2013). School-level aggregated data reported on the school district’s website were accessed for school years 2002/2003 to 2009/2010. Disciplinary referrals were based on a range of disruptive, bullying, and illegal student behaviors, the latter of which included (but were not limited to) vandalism, assault, theft, and possession of drugs or dangerous weapons. Analyses on school-level data were adjusted for school size by including it as an exposure variable in the model.

      School-level suspensions (Lewis et al., 2013). The measure came from the same source and used the same procedures as for disciplinary referrals.
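
      The school-size adjustment described for disciplinary referrals and suspensions can be written out as below. This is a sketch that assumes a count model with a log link (for example, Poisson or negative binomial); the model family is not stated in this section, and the symbols are introduced here for illustration only.

```latex
\log \mathbb{E}[Y_s] = \log N_s + \beta_0 + \beta_1 T_s
\quad\Longleftrightarrow\quad
\frac{\mathbb{E}[Y_s]}{N_s} = \exp(\beta_0 + \beta_1 T_s)
```

      Here Y_s is the number of referrals or suspensions in school s, N_s is school enrollment entering as the exposure (its log has a coefficient fixed at 1), and T_s indicates assignment to Positive Action. Under these assumptions, exp(β_1) is the ratio of per-student rates between intervention and control schools.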

      School-level standardized reading scores (Bavarian et al., 2013). Archival reading scores of non-English Language Learners came from a standardized, school-administered, statewide test (the ISAT). A single weighted average of the percentages of students falling into four achievement levels was created for each school overall and by demographic subgroups. A value-added metric index of ISAT performance reported by the school district was used to control for the prior year ISAT scores of students. Data were available for the cohort transitioning from grades 7 to 8 (2009-2010).

      School-level standardized math scores (Bavarian et al., 2013). The standardized math scores also came from the ISAT.

      School-level absenteeism rates (Bavarian et al., 2013). The school district reported average daily attendance rates for each school on a scale from 0 to 100%; these statistics were converted to a measure of average daily absenteeism by subtracting each school’s respective year-end attendance rate from 100.

      Positive behaviors (Washburn et al., 2011). A total of 51 behavior items were asked, each with the same four response options: “none of the time,” “some of the time,” “most of the time,” or “all of the time.” The options were coded 1 for “all of the time” and 0 otherwise, summed, and transformed into a percentage of maximum possible score.
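
      A minimal sketch of the positive-behaviors scoring follows. It assumes that only "all of the time" counts as endorsing an item; that reading of the coding rule is an assumption, so the set of endorsing responses is left as a parameter. The function name and example data are hypothetical.

```python
def positive_behavior_score(responses, endorsing=("all of the time",)):
    """Code each item 1 if the response is in `endorsing` and 0 otherwise,
    sum the items, and express the sum as a percentage of the maximum."""
    if not responses:
        raise ValueError("at least one item response is required")
    endorsed = sum(1 for r in responses if r in endorsing)
    return 100 * endorsed / len(responses)


# Hypothetical student: 17 of 51 items answered "all of the time".
answers = ["all of the time"] * 17 + ["some of the time"] * 34
print(round(positive_behavior_score(answers), 2))  # 33.33
```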

      Positive affect (Lewis, DuBois, et al., 2013). Positive affect was measured using a modified 6-item version of the Positive and Negative Affect Scale for Children. Students reported the extent to which they had experienced each type of feeling (e.g., excited, happy) in the last 2 weeks using a 4-point scale ranging from "None of the time" (1) to "All of the time" (4). Alphas ranged from .70 to .87 across time points.

      Life satisfaction (Lewis, DuBois, et al., 2013). The measure consisted of 3 items: "My life is just right", "I have a good life", and "I have what I want in life". Students indicated how much they agreed with each statement on a 4-point scale ranging from "NO!" (1) to "YES!" (4). Alphas ranged from .71 to .84 across time points.

      Depression (Lewis, DuBois, et al., 2013). Assessed at waves 5 through 8, the measure used six items from the Behavior Assessment System for Children, such as “I feel depressed.” Alphas ranged from .70 to .79.

      Anxiety (Lewis, DuBois, et al., 2013). Assessed at waves 5 through 8, the measure used six items from the Behavior Assessment System for Children, such as “I often worry about something bad happening to me.” Alphas ranged from .70 to .79.

      Healthy food consumption and exercise (Bavarian et al., 2016). The multi-item scale available at all waves averaged responses to questions about how much of the time they “eat fresh fruits and vegetables”, “drink or eat dairy products”, and “exercise hard enough to sweat and breathe hard.” The items, which allowed for responses ranging from 1 = none of the time to 4 = all of the time, loaded together in a factor analysis.

      Unhealthy food consumption (Bavarian et al., 2016). The multi-item scale available at all waves averaged responses to questions about how much of the time they “eat junk food (chips and candy)”, “eat fast food”, and “drink soda pop.” The items, which allowed for responses ranging from 1 = none of the time to 4 = all of the time, loaded together in a factor analysis.

      Personal hygiene (Bavarian et al., 2016). The multi-item scale available at all waves averaged responses to the following statements: “I wash my hands after using the toilet”, “I brush my teeth at least twice a day”, and “I cover my nose and mouth when I sneeze”. The items, which allowed for responses ranging from 1 = none of the time to 4 = all of the time, loaded together in a factor analysis.

      Consistent bedtime (Bavarian et al., 2016). For this single-item measure, students at all waves rated how much of the time they “go to bed by 9:00 PM on school nights.” The measure was re-categorized as dichotomous (0 = Not all of the time and 1 = All of the time) based on research suggesting school-age children need 10-11 hours of sleep each night.

      Body Mass Index (BMI) percentile (Bavarian et al., 2016). Measures of weight and height taken by researchers were used to calculate BMI percentiles, which were then classified as underweight, normal, at risk for overweight, and obese. Due to small sample size (N = 3 students), the underweight category was not included in the analysis, and a dichotomous outcome, unhealthy BMI percentile, was created for analyses, with 0 = healthy weight and 1 = overweight or obese (i.e., BMI at or above the 85th percentile).
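
      The dichotomization can be sketched as below. The 85th-percentile threshold comes from the description above; the underweight cutoff (below the 5th percentile) is an assumption based on the common CDC convention and is not stated in this write-up, and the function name is hypothetical.

```python
from typing import Optional


def unhealthy_bmi(bmi_percentile: float,
                  underweight_cutoff: float = 5.0) -> Optional[int]:
    """Return 1 for overweight/obese (>= 85th percentile), 0 for healthy
    weight, and None for underweight cases, which were excluded.

    underweight_cutoff is an ASSUMED value (CDC convention), not a figure
    reported in the write-up.
    """
    if bmi_percentile < underweight_cutoff:
        return None  # underweight: dropped from the analysis
    return 1 if bmi_percentile >= 85.0 else 0


# Hypothetical percentiles for four students (illustrative only).
for pct in (3.0, 50.0, 85.0, 97.0):
    print(pct, unhealthy_bmi(pct))
```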

      Analysis: Outcomes related to problem behaviors (Li et al., 2011) were analyzed using multiple imputation to handle the missing values on the covariates and baseline problem behaviors. The strategy of "multiple imputation, then deletion" (MID) used all cases for imputations but then deleted cases with imputed outcome variables. Students who joined the schools after the beginning of the study accounted for the highest proportion of the imputed values on baseline problem behaviors (41.5%) in the analyzed data. To test for baseline differences, multilevel analyses were conducted to examine whether cohort (grade 3) students in the intervention schools were different from students in the control schools on demographics and baseline behaviors. The analyses on baseline problem behaviors were run with and without imputed data. To test for differential attrition, multilevel regressions of baseline problem behavior on stayer versus dropout group membership were conducted using multiple imputed datasets.

      To examine program effects, multilevel Poisson models were used with and without imputed data. The three-level models included students (level 1) nested within schools (level 2) nested within pairs (level 3). Introducing a third-level random effect partitioned the between-pair variation from the within-pair variation; hence, intervention effects could be tested with greater precision given that the pairs were well matched. The significance of the program effects in the multilevel models was tested against a standard normal distribution, which assumes a sufficiently large number of schools. Sensitivity analyses using adjusted degrees of freedom (d.f. = 12) were conducted to provide more conservative tests of the program effects for each outcome behavior.

      Outcomes related to substance use (Lewis et al., 2012) were analyzed for program effects using three-level multilevel growth curve models (with time nested within individuals, and individuals nested within schools) that controlled for baseline values, and for mediation of program effects using structural equation models. Missing values were handled with Full Information Maximum Likelihood Estimation, and standard errors were estimated using bootstrap procedures. The models did not control for baseline outcomes, as asking about substance use in grade 3 would have little value. Tests of significance were done at the school level and based on the sample size of 14, while effect sizes were calculated from the student-level analysis.

      Outcomes related to academics (Bavarian et al., 2013) were analyzed using three-level multilevel growth curve models (with time nested within individuals, and individuals nested within schools) for student-level data and two-level multilevel growth curve models for school-level data (time nested within schools). All analyses controlled for baseline values. The models treated condition as a school-level variable. Full information maximum likelihood estimation used all valid observations to model school differences and tested for group-by-time and group-by-time squared interactions. Because of the small Ns (7 per condition), one-tailed p-values were used in tests of effects of the program on school-level outcomes. Gender and student mobility were examined as moderators.

      Outcomes related to problem behaviors (Lewis et al., 2013) were analyzed using three-level multilevel growth curve models for student- and parent-reports (time nested within individuals, and individuals nested within schools) and two-level multilevel growth curve models for school-level data (time nested within schools) that controlled for baseline values. The models treated condition as a school-level variable. To handle missing data, full information maximum likelihood estimation was used with logistic regression for binary outcomes and Poisson regression for count outcomes. Gender and student mobility were examined as moderators for student- and parent-reported measures.

      Outcomes related to emotional health (Lewis, DuBois, et al., 2013) were analyzed using either growth curve models of change over time or random intercept models of wave 8 outcomes. The models for depression and anxiety (measures available for waves 5 to 8) used the school-level average of student-reported negative affect as a baseline control. Mediated effects were analyzed using structural equation models, with missing values handled by Full Information Maximum Likelihood Estimation. Because the structural equation models could not incorporate clustering, supplementary analyses estimated three-level models to account for within-school clustering.

      Outcomes related to health behaviors (Bavarian et al., 2016) were analyzed with random-time coefficients in structural equation models. The structural equation models could not be estimated in a multilevel framework, but sensitivity analyses were done that included three levels and tests for the condition by time interaction without structural equation modeling. Multiple imputation was used for the model of BMI percentile.

      Outcomes

      Attrition and Baseline Group Equivalence: PA and control schools were not significantly different on any baseline measures, including child-, parent-, and teacher-reported measures. Differences in ethnic composition, however, were significant; there were more African-American students and fewer students in the other/mixed ethnicity group for the control condition compared to the PA condition. Multilevel regression analyses, controlling for demographic variables and clustering of students within schools and schools within pairs, showed that PA students reported marginally higher rates of problem behaviors at baseline than control students using the original data (i.e., including stayers only; n~290 with valid responses on baseline problem behaviors). For multiply-imputed datasets (including stayers and newcomers; n~510), the difference was nonsignificant. To control for potential bias, baseline problem behaviors as well as demographic variables were controlled for in the following analyses.

      Across the multiple imputations, no significant difference on baseline problem behavior was found between stayers and dropouts controlling for demographic variables and clustering of students within schools and within pairs. Comparisons between stayers and newcomers in the control group on the behavior outcomes also showed nonsignificant differences (substance use, incidence ratio, violence, bullying behaviors and disruptive behaviors). Results suggested that stayers endorsed fewer items on problem behaviors than dropouts and newcomers, although none of the differences were statistically significant.

      During the final wave, differences on ethnicity composition were found between the program and the control groups, consistent with the sample characteristics at baseline.

      Response rates for teacher surveys ranged from 85 to 100% at all collection points while rates for parent reports were 93% in the first Wave, 77% for Wave 2, 76% for Wave 3, and 72% for Wave 4. (It should be noted that reports of data from baseline through Wave 5 do not include parent- or teacher-reported data.) As for students themselves, researchers provide information on all students who entered study schools at baseline or at any point over the study period, as analysis included all youth who were present in study schools at each wave. In addition to the 593 youth who were present at baseline (47% retained at Wave 5), 88 youth entered at Wave 2 (27% retained at Wave 5), 100 entered at Wave 3 (55% retained at Wave 5), 52 entered at Wave 4 (56% retained at Wave 5), and 108 entered at Wave 5. Thus, 941 students completed at least one local-site survey. Five hundred twelve students participated in the Wave 4 assessment (61% of the 833 who ever participated up to that point) and 500 (53%) participated in the Wave 5 assessment. There is no mention in the report of analyses conducted at Wave 4 or Wave 5 to determine if the program and control groups were still equivalent, but email communication with researchers states that there was no differential dropout or addition by group condition. From waves 6 to 8, there were 392 leavers and 240 joiners (Lewis et al., 2013, supplementary materials). By Wave 8, 363 students (including 131, or 21%, of the original cohort students) remained in the study, reflecting changes in school sizes, consent rates, and the high mobility rate of this population. There were 1170 students with data for at least one wave across all eight waves.
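
      The participation totals in the paragraph above can be checked with a short calculation. The counts are those reported in the text; the variable names are hypothetical.

```python
# Students entering the study at each point, as reported above.
entrants = {
    "baseline": 593,
    "wave2": 88,
    "wave3": 100,
    "wave4": 52,
    "wave5": 108,
}

# Students who had ever participated through Wave 4 and through Wave 5.
ever_by_wave4 = sum(n for wave, n in entrants.items() if wave != "wave5")
ever_by_wave5 = sum(entrants.values())

# Participation at each assessment as a share of those ever enrolled.
wave4_rate = 512 / ever_by_wave4
wave5_rate = 500 / ever_by_wave5

print(ever_by_wave4, f"{wave4_rate:.0%}")  # 833 61%
print(ever_by_wave5, f"{wave5_rate:.0%}")  # 941 53%
```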

      Implementation Fidelity: Schools varied widely in how well they implemented the program. Implementation fidelity measures included teacher-reported amount and quality of classroom PA activities and teachers' perceived effectiveness of and attitudes towards the program, which were measured weekly, by unit, and yearly. Additional measures included year-end school administrator reports and mid-year and year-end student reports of exposure to program activities and attitudes towards the program. By the end of year 3, one school was implementing at a low level (scoring 50% or less on all implementation measures), while 4 were implementing at a moderate level (between 50% and 60%), and two were implementing at a moderate to high level (between 60% and 70%).

      Li et al. (2011): After three years of the PA program, results using the MID approach showed students in the PA schools endorsed significantly fewer items for substance use, serious violence, and bullying behaviors. PA students also reported engaging in fewer disruptive behaviors, although this difference was not statistically significant. The positive program effects can be translated into a 31% reduction in substance use behavior, a 36% reduction in violence behavior, a 41% reduction in bullying behaviors, and a 27% (not significant) reduction in disruptive behaviors.
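
      For readers unfamiliar with Poisson models, the sketch below shows one conventional way a coefficient is translated into a percent reduction: the exponentiated coefficient is an incidence rate ratio (IRR), and the reduction is (1 − IRR) × 100. This is an illustration under the assumption that the percentages above were derived this way, which the write-up does not state; the coefficient used is back-calculated for illustration and is not a reported estimate.

```python
import math


def percent_reduction(log_rate_ratio: float) -> float:
    """Translate a Poisson regression coefficient (log incidence rate
    ratio) into a percent reduction in the expected count."""
    incidence_rate_ratio = math.exp(log_rate_ratio)
    return (1.0 - incidence_rate_ratio) * 100.0


# Hypothetical coefficient: an IRR of 0.69 implies a 31% reduction.
beta = math.log(0.69)
print(round(percent_reduction(beta)))  # 31
```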

      Lewis et al. (2012): The study found that intervention students had significantly lower scores on the substance use composite scale at Wave 8 (sensitivity analyses treating the outcome as a count variable were similar). In addition, the intervention had a significant effect on the slope of the Social and Emotional Character Development Scale (SECD); scale scores declined over time, but the intervention significantly mitigated this decline.

      Mediation analysis showed that the intervention had a significant indirect effect on the substance use composite through the SECD scale and that the mediation by SECD eliminated the direct effect of the intervention on the substance use composite (i.e., complete mediation).

      Bavarian et al. (2013): The study reported statistically significant program effects for disaffection with learning (one of the two student-report measures; effect size=-0.19; p < .01 two-tailed) and significant effects for academic motivation (one of the two teacher-report measures; effect size=0.39, p < .05 two-tailed). However, the benefit of the intervention for disaffection with learning had disappeared by the end of the study. No significant differences emerged for measures of grades or teacher rated academic performance. Analysis of school-level data revealed marginally significant lower absenteeism in program schools versus control schools (effect size=-0.78, p < .05 one-tailed).

      For all students, marginally significant results were found for school-based reading test scores at grade 8 but not for math test scores. African American males improved significantly more on reading (effect size=1.5, p < .05 two-tailed).

      Gender appeared to moderate the effects of Positive Action on teacher-rated academic ability with program effects being larger for boys than girls.

      Lewis et al. (2013): The condition-by-time interaction coefficients revealed statistically significant program effects for six of the eight problem-behavior measures examined. Three of the four youth-reported measures – normative beliefs supporting aggression (effect size=-0.68, p < .01), bullying (effect size=-0.39, p < .01), and disruptive behaviors (effect size=-0.50, p < .05) – improved significantly more for program schools than control schools. Gender moderated youth-reported problem behavior with girls demonstrating greater improvement than boys.

      Changes in one of the two parent-reported measures, bullying (p<.05), were positively affected by the program. Gender again moderated parent-reported problem behavior but with boys demonstrating greater improvement than girls.

      Both school-level outcomes, disciplinary referrals and suspensions, were significantly improved when comparing program schools and control schools (p<.01).

      The analysis of the two teacher measures showed no significant program effects.

      The results in Table 2 of the article also showed significant interactions of condition by time squared, which may reflect convergence of the conditions over time. To check, Table 3 compared predicted probabilities of the conditions at Wave 8 (though without significance tests). For outcomes with a significant interaction of condition by time squared, the effect sizes at Wave 8 were -.37 for bullying, -.58 for school-level disciplinary referrals, and -.27 for school-level suspensions.

      Washburn et al. (2011): The study found a significant year by condition interaction that indicated a positive program effect on students' positive behaviors. Children in intervention schools had a mean score of 63.53 at baseline and children in control schools had a mean of 67.55. By the final wave of the study, children in control schools had a mean score of 39.71, and children in intervention schools had a higher mean of 43.52. Cohen’s d at the final wave, controlling for baseline differences, was 0.41.

      Lewis, DuBois, et al. (2013): The study of emotional health used all eight waves of data for outcomes of positive affect and life satisfaction and used Waves 5 through 8 for outcomes of depression and anxiety. For positive affect, the intervention by time interaction was marginally significant (p < .10; d = .17). For life satisfaction, the intervention by time interaction and intervention by time-squared interaction were significant, which produced “a notable difference at study endpoint (d = .13) that favored students in PA schools.” For depression and anxiety, intervention students reported significantly fewer symptoms at the Wave 8 endpoint (p < .05; d = -.14 and p < .001; d = -.26, respectively).

      In addition, mediation analyses showed significant indirect effects of the intervention on each of the four outcomes via the measure of social and emotional character development. The indirect effects occurred for the time slope for positive affect and life satisfaction and for the Wave 8 endpoint for depression and anxiety.

      Bavarian et al. (2016): The study of health behaviors showed that the program significantly reduced unhealthy food consumption in the structural equation model but not in the sensitivity checks. It had a marginally significant positive effect on healthy food consumption and exercise. It did not have a significant effect on personal hygiene, but the sensitivity check showed a marginally significant positive effect. The program had no effects on consistent bedtime or BMI percentile.

      The mediation analyses showed significant indirect effects of the intervention via social and emotional character development on healthy food and exercise and personal hygiene.

      Mathematica Analysis: An independent analysis of the same data by Mathematica found no significant effects. In a letter to the Blueprints Board, Dr. Flay says that the poor results came from 1) evaluation of Positive Action along with six other programs, 2) low statistical power stemming from the small number of schools, 3) examination of only the subset of outcome variables and years available from the full study, and 4) use of models assuming a normal distribution of outcomes. Thus, using different measures, matching analytic techniques to the distribution of the outcomes, and extending outcomes to grade 8 produced positive findings.

      Flay, Allred, and Ordway (2001)

      Evaluation Methodology

      Design: This study used retrospective school-level archival data to employ a matched-control research design. Two school districts, one in Nevada and one in Hawaii, both with eight or more elementary schools that had implemented PA for three or more years and had easily-available school-level archival data (e.g., student performance and disciplinary referrals/actions), were chosen for the evaluation. Two matched control schools for each program school were selected - matched on the following demographics: percent free/reduced lunch, mobility rates, and ethnic distributions. Data from all (matching and non-matching) non-PA schools were also included in a third category for analysis.

      Sample: In the Nevada school district there were 12 PA schools, 24 matched control schools, and 87 non-PA schools (including matched controls). In the Hawaii district there were 8 PA schools, 16 matched control schools, and 117 non-PA schools (including matched controls). The PA and matched control schools in both districts were comparable on all available variables. In Nevada, the PA schools were also similar to all non-PA schools. However, in Hawaii, the PA schools, compared to all non-PA schools, had higher percentages of Japanese/Chinese students, lower percentages of White and Hawaiian students, higher rates of mobility, and lower percentages of students receiving free/reduced lunch.

      Measures: School archival data consisted of standardized test scores and disciplinary reports. In Nevada, achievement scores were the average of the 1995-96 and 1996-97 district level Grade 4 percentile ranks on the TerraNova Comprehensive Test of Basic Skills. Disciplinary data consisted of reports of incidents of student-to-student violence, student-to-staff violence, and possession of weapons, for the same two years. Absenteeism rates were also analyzed. In Hawaii, achievement data consisted of the percent of students scoring above average on the Stanford Achievement Test for three school years (1994-95, 1995-96, and 1996-97). Hawaii reports disciplinary data in four categories: felonies, misdemeanors, department rules, and school rules. Number and rates of suspensions and absenteeism rates were also analyzed for Hawaii. Because preliminary analyses found no significant differences across years of data, data across years were combined for all reported analyses.

      Analysis: Analysis of covariance was conducted, using the stratifying variables as covariates. For achievement data, multivariate analysis was used to determine if there were overall effects, then univariate analyses. For disciplinary data, independent tests were conducted. In all cases, tests for interactions of condition (program or not) with the covariates were employed.

      Outcomes

      Multivariate tests of condition on achievement using one-tailed tests of significance, controlling for percent free/reduced lunch, student mobility, and percent African American students, yielded significantly higher results for the PA schools in Nevada, compared to the matched controls. The three covariates were all significant. In addition, univariate tests on school achievement showed significantly higher results for the PA schools compared to their matched controls on the following measures: math, reading, and language. For comparisons with all non-PA schools, condition was not significant in the multivariate analysis. For violence data in Nevada, significant program effects were observed for comparisons with all schools and the comparisons with the matched controls for student-to-student and student-to-staff violence and for total number of incidents and incidents per 1,000 students. Marginally significant effects favoring the treatment schools were observed for possession of weapons in the matched control comparison. There were no significant results regarding rates of absenteeism.

      In Hawaii, three covariates were significant predictors in the multivariate ANOVA for achievement when comparing PA schools with all other schools: parent education, percent free/reduced lunch, and percent Japanese/Chinese. Condition was also significant, and univariate tests showed significantly higher math, reading, and combined scores for the PA schools compared to all others. When comparing PA schools with matched controls, the results were parallel with the exception of parent education. For disciplinary data in Hawaii, all indicators (e.g., felonies, misdemeanors, department rules, school rules, and total incidents) were significantly different when PA schools were compared to non-PA schools, and all but misdemeanors were significant when compared with matching controls. Number and rates of absenteeism were also lower for PA schools, when compared to matching controls and all non-PA schools.

      Flay and Allred (2003)

      Evaluation Methodology

      Design: This study used retrospective school-level archival data to employ a matched-control research design. The study selected one large southeastern school district that had school-level archival data on student performance and disciplinary referrals/actions easily available for both elementary and secondary schools and that had a significant number of schools that had implemented PA for 4 or more years. Some schools in the district had never used PA or had stopped using it 4 or more years prior to 1997-98 school year (non-PA, n=28). Others had used it for 4 or more years prior to 1998 (PA-only, n=45), and others had also adopted other supplementary character/behavior programs, in addition to continued use of PA (PA+Other, n=20). Each of the latter two groups had used PA for an average of 7 years (range = 4-9 years). These three groups of schools were compared to assess program effects on elementary school student achievement and behavior.

      School report card (SRC) data were used to find matching sets of one PA-only school, one PA+Other school and one non-PA control school. Schools were matched by size, percent free and reduced lunch, percent mobility, and ethnic distribution. There were no significant differences between PA-only schools and PA+Other schools. PA schools were not different from the matched control (non-PA) schools on any matching variables. (Retrospective analysis also determined that they were comparable on one of the outcome variables -- academic test scores – thus making this study of stronger design than Study 3 above.) However, PA schools were substantially different than all non-PA schools. PA schools were at lower risk because they had lower proportions of students receiving free/reduced lunch, lower mobility rates, and lower proportions of minority students, but they were at higher risk because they were larger and had a higher student-teacher ratio. Since the evaluators found no significant differences between the PA-only and PA+Other schools on outcomes, these two conditions were combined for the analyses.

      For analyses of the sustained effect of PA into middle schools, the proportion of feeder elementary schools that had implemented PA for at least the prior 4 years was calculated. For analyses of sustained effects of PA into high school, the proportion of feeder schools that had implemented PA for at least the 8 years prior was calculated. In each case, the evaluators tried to ensure that students in the middle or high schools would have received at least two years of PA. Each of the 33 middle schools in the district was categorized as low PA (less than 60% of their students being PA graduates), medium PA (60-79% PA graduates), and high PA (80%-100% PA graduates). The 18 high schools in the district were also categorized based on the percentage of their students having graduated from PA during elementary school: low PA (0-15% PA graduates), medium PA (16%-26% PA graduates) and high PA (27-50% PA graduates). None of the high schools had more than 50% of their students coming from elementary schools with PA.
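
      The dosage categories described above can be expressed as a simple lookup. This is a sketch only: it assumes percentages are rounded to whole numbers (the write-up does not say how values between category bounds, such as 15.5%, were handled), and the function names are hypothetical.

```python
def middle_school_dosage(pct_pa_graduates: int) -> str:
    """Categorize a middle school by the whole-number percentage of its
    students who are PA graduates (low < 60, medium 60-79, high 80-100)."""
    if not 0 <= pct_pa_graduates <= 100:
        raise ValueError("percentage must be between 0 and 100")
    if pct_pa_graduates < 60:
        return "low"
    if pct_pa_graduates < 80:
        return "medium"
    return "high"


def high_school_dosage(pct_pa_graduates: int) -> str:
    """Categorize a high school (low 0-15, medium 16-26, high 27-50).
    No high school in the district exceeded 50% PA graduates."""
    if not 0 <= pct_pa_graduates <= 50:
        raise ValueError("percentage must be between 0 and 50")
    if pct_pa_graduates <= 15:
        return "low"
    if pct_pa_graduates <= 26:
        return "medium"
    return "high"


# Hypothetical schools (illustrative only).
print(middle_school_dosage(72), high_school_dosage(30))
```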

      Sample: The matched PA and control schools averaged 62-68% students receiving free or reduced-price lunch, and had 45-51% white students, 25-28% African American students, and 21-23% Hispanic students. About 37% of the students were reading above the median on normed tests, about 16% were writing above the median, and 45% were performing math above the median.

      Measures: Elementary achievement data consisted of mean scores on the Florida Reading Test and the Grade 4 Florida Comprehensive Assessment Test (FCAT) for the 1997-98 school year. Behavioral data consisted of disciplinary referrals for incidents of violence per 100 students, percent of students who received out-of-school suspensions, and percent of students absent for 21 or more days during the school year.

      Middle-school standardized achievement test data were the percent of students scoring above the median on the 8th grade norm referenced tests (NRT) of reading and math (1997-1998). Indicators of behavior included incidents per 100 students of substance use, violence, dissing behaviors (disrespect, disobedience, disorderly, and disruptive), and property crimes. All behavioral data were coded disciplinary referrals by school principals or disciplinary officers. Absenteeism data were also available.

      High school standardized achievement test data (1997-98) were the percent of 10th grade students scoring 3 or greater on the Florida Writes test, percent of seniors passing the High School Competency Tests (HSCT) of communications and math, mean Scholastic Aptitude Test (SAT) scores, and mean American College Testing (ACT) composite scores. Percent absent 21 or more days and percent dropout were other indicators of school involvement. Behavioral data (1998-99) included disciplinary referrals for substance use (tobacco, alcohol, and illicit drugs), violence (threatening, fighting, carrying weapons, and battery), dissing behaviors, sexual behaviors, property crime, breaking of school rules, misbehavior on or near school buses, parking violations, and falsification of reports. Data on percent of students suspended (separately for in-school and out-of-school) was also included.

      Analysis: Multivariate general linear modeling (GLM) with fixed effects for condition and pair number was conducted for the comparison of matched PA and non-PA schools. To estimate the effects of receiving PA in elementary school on achievement and behavior in middle and high school, the evaluators conducted GLM for each set of outcomes using percent PA students as the independent variable and using percent free/reduced lunch, school size, and percent mobility as covariates.

      Outcomes Posttest and Long-Term:

      Elementary School Results - Scores on the reading test and FCAT were significantly higher for schools receiving PA in both the all-schools analysis and for those in the matched control analysis. The number of violent incidents per 100 students was significantly lower in PA schools than comparison schools in the all-schools analysis, and in the matched-school analysis. The percentage of students receiving out-of-school suspensions was significantly lower for the matched-schools analyses and marginally lower for the all-school analyses. No significant effects were reported for absenteeism.

      Middle School Results - For all outcomes, middle schools with more PA graduates fared significantly better than schools with fewer PA graduates. These results included reading scores, math scores, and behavioral incidents per 100 students for drug use, violence, disorderly conduct, and property crime.

      High School Results - Significant outcomes suggesting favorable program results were reported for the following indicators: percent scoring greater than 3 on Florida Writes, percent passing HSCT tests of communications, mean SAT scores, percent continuing education, percent school drop-out, substance use, violence, sexual behavioral problems, dissing behaviors, absent over 21 days, and percent in- and out-of-school suspensions. No significant effects occurred for behaviors related to property crime, school rules, busing, and parking. For all significant outcomes, there was a clear dose-response relationship with higher dosage schools experiencing better outcomes.

      Outcomes - Brief bullets:

      • Scores on the reading test and FCAT were significantly higher for elementary schools receiving PA in both the all-schools analysis and the matched control analysis.
      • The number of violent incidents per 100 students was significantly lower in PA elementary schools than comparison schools in the all-schools analysis and in the matched-schools analysis.
      • The percentage of students receiving out-of-school suspensions was significantly lower for PA schools in the matched-schools analysis and marginally lower for PA schools in the all-schools analysis.
      • For all outcomes, middle schools with more PA graduates fared significantly better than schools with fewer PA graduates. These results included reading scores, math scores, and behavioral incidents per 100 students for drug use, violence, disorderly conduct, and property crime.
      • Significant outcomes suggesting favorable program results were reported for the following indicators for high schools with a higher percentage of students that received PA: percent scoring greater than 3 on Florida Writes, percent passing HSCT tests of communications, mean SAT scores, percent continuing education, percent school drop-out, substance use, violence, sexual behavioral problems, dissing behaviors, absent over 21 days, and percent in- and out-of-school suspensions.

      Generalizability: Since probability-sampling techniques were not used to select the PA program schools, it is unclear whether these schools are representative of elementary schools in the district in which they reside or of elementary schools in general.

      Southeastern State Trial (Washburn et al., 2011)

      Eight rural public elementary schools, with five age cohorts ranging from kindergarten to grade 4, were matched and randomized (no details were available regarding randomization). Seven school-level variables available at baseline indicated no differences between treatment and control schools. There were 1,652 students in the first wave, 1,944 in the second, and 1,504 in the third. The only outcome measure was the frequency of positive behaviors associated with character, which was assessed at the end of the first through third years. There was no pretest. The trajectories of children were compared from the first year of intervention through the end of the third year using a multi-level, growth-curve analysis. There was a significant intervention effect, with Positive Action mitigating the decline in students' endorsement of positive behaviors.

      Preschool Study (Flay, 2012)

      This evaluation focused on a version of the program for preschool children rather than, as in the other studies, on the program for older children.

      Evaluation Methodology

      Design: Using a convenience sample of three preschools in Virginia and 12 classrooms/instructors, the study randomly assigned students to classrooms/instructors that had or had not previously been assigned (non-randomly) to offer Positive Action lessons. The intervention students received a condensed version of the Positive Action pre-K program (60 daily 15-20 minute lessons).

      A pretest done in September was followed by a posttest in December-January. At pretest, 12 instructors from 3 sites rated 146 students, while at posttest, 11 instructors from 2 sites rated 163 students. The study analyzed those students who completed both pretest and posttest, leaving 135 students (92%) from 11 instructors.

      Sample Characteristics: Not available.

      Measures: Classroom instructors completed a web-based rating of their students on 33 items corresponding to the domains addressed by the program. Responses ranged from 1 (not at all like this student) to 7 (very much like this student), with high scores representing better behavior. Eleven scales were constructed from the responses: understanding PA, self-concept, physical health, intellectual health, self-management, self-control, respect, consideration, social bonding, honesty, and self-improvement. Alpha coefficients ranged from .758 to .933. In addition, a total scale constructed from all the items had an alpha coefficient of .978.
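      The alpha coefficients above are presumably Cronbach's alpha, the usual internal-consistency statistic, although the summary does not say so explicitly. As a minimal sketch under that assumption, the following shows how such a coefficient is computed from item-level ratings. The ratings are made up for illustration and are not data from the study.

```python
import statistics

def cronbach_alpha(items):
    """Cronbach's alpha for a scale.

    `items` is a list of lists: one inner list per item, holding that
    item's ratings for every student in the same order.
    """
    k = len(items)
    # Total scale score for each student (sum across items).
    totals = [sum(scores) for scores in zip(*items)]
    sum_item_variances = sum(statistics.variance(item) for item in items)
    return k / (k - 1) * (1 - sum_item_variances / statistics.variance(totals))

# Hypothetical 1-7 ratings of six students on a three-item scale.
ratings = [
    [5, 6, 3, 7, 4, 2],
    [5, 7, 3, 6, 4, 3],
    [4, 6, 2, 7, 5, 2],
]
alpha = cronbach_alpha(ratings)
print(round(alpha, 3))
```

      Alpha approaches 1 when the items rise and fall together across students, which is why a total scale built from many closely related items can reach a value as high as the .978 reported here.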

      Analysis: Results presented t-tests for the mean differences between the change for the control group and the change for the intervention group (i.e., the difference of differences).
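      The difference-of-differences comparison can be sketched as follows. This is an illustration only: the ratings are hypothetical, and the use of Welch's unequal-variance t statistic is an assumption, since the summary does not say which form of t-test the study used. The sketch also ignores any clustering of students within instructors.

```python
import math
import statistics

def change_scores(pre, post):
    """Per-student change from pretest to posttest."""
    return [after - before for before, after in zip(pre, post)]

def welch_t(x, y):
    """Welch's t statistic and approximate degrees of freedom for mean(x) - mean(y)."""
    vx = statistics.variance(x) / len(x)
    vy = statistics.variance(y) / len(y)
    t = (statistics.mean(x) - statistics.mean(y)) / math.sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return t, df

# Hypothetical instructor ratings on the 1-7 scale (not study data).
pa_pre, pa_post = [4.0, 5.0, 4.5, 5.5, 5.0], [5.5, 6.0, 5.5, 6.5, 6.5]
ctl_pre, ctl_post = [3.5, 4.0, 4.5, 3.0, 4.0], [4.0, 4.0, 5.0, 3.5, 4.5]

pa_change = change_scores(pa_pre, pa_post)
ctl_change = change_scores(ctl_pre, ctl_post)

# Difference of differences: intervention change minus control change.
diff_in_diff = statistics.mean(pa_change) - statistics.mean(ctl_change)
t_stat, df = welch_t(pa_change, ctl_change)
print(round(diff_in_diff, 2), round(t_stat, 2), round(df, 1))
```

      Because the test compares changes rather than posttest levels, a group that starts higher at pretest is not automatically favored, although baseline differences can still matter if higher-scoring students tend to change at a different rate.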

      Outcomes

      Implementation Fidelity: Four of six instructors reported that they delivered nearly all the lessons. Students received an average of 4.8 lessons per week and a median of 6.0 lessons per week. Also, based on the teacher reports and scales with a maximum of 6, students had a mean engagement of 4, a mean for talking about the program with parents of 3.9, and a mean for discussing PA outside the classroom of 3.2. The teacher-reported mean for discussion between parents and teachers was 2.5.

      Baseline equivalence: Intervention children had significantly higher pretest means on 10 of the 11 outcomes.

      Differential attrition: No analysis of missing data or attrition was presented.

      Posttest: The intervention group improved significantly more than the control group on all 11 outcomes and the total scale. Effect sizes ranged from .36 to .72.

      Long-term: Not available.

      Limitations

      • Outcome measures came from ratings of instructors who also delivered the program rather than independent observers.
      • The intervention students had higher mean scores at baseline on nearly all outcomes.
      • No analysis of attrition was presented, although attrition was only 8%.
      • Narrow sample from three preschools in one state.
      • No long-term data.

      Guo, S., Wu, Q., Smokowski, P. R., Bacallao, M., Evans, C. B. R., & Cotter, K. L. (2015). A longitudinal evaluation of the Positive Action Program in a low-income, racially diverse, rural county: Effects on self-esteem, school hassles, aggression, and internalizing symptoms. Journal of Youth and Adolescence, 44, 2337-2358.

      Smokowski, P. R., Guo, S., Wu, Q., Evans, C. B. R., Cotter, K. L., & Bacallao, M. (2016). Evaluating dosage effects for the Positive Action Program: How implementation impacts internalizing symptoms, aggression, school hassles, and self-esteem. American Journal of Orthopsychiatry. Advance online publication, http://dx.doi.org/10.1037/ort0000167.

      Evaluation Methodology

      Design:

      Recruitment: More than 7,000 middle schools and 11 high schools in two rural, economically disadvantaged counties in North Carolina were approached via the North Carolina Academic Center for Excellence in Youth Violence Prevention project. The Guo et al. (2015) and Smokowski et al. (2016) studies used the same data from 4 waves of the Rural Adaptation Project panel data collected between 2011 and 2014. The sample size, depending on year and analysis, ranged from N=1246 to N=5894.

      Assignment: The data came from two counties non-randomly assigned to conditions. The control county sampled all middle school students (6th – 8th grade) in Year One of the study, and all incoming 6th grade students in the subsequent years. The intervention county was larger, and although all students in the county received the intervention, 40% of all middle school students (grades 6-8) were randomly sampled for the first wave, and 500 more incoming 6th grade students were randomly sampled in every subsequent year. The intervention was delivered for three consecutive years. To adjust for non-random county assignment, the analysis used propensity score matching. Guo et al. (2015) compared matched intervention and control groups, while Smokowski et al. (2016) compared matched groups defined by dosage levels.

      The study noted that consent was requested from parents in the intervention schools but apparently not in the control schools.

      Attrition: Data were collected in four waves and the sample size ranged overall from 3715 to 5894 and from 1246 to 1968 for propensity score matching. Other than to say that subjects needed at least two waves of data to be included in the analysis, the study did not provide details on attrition.

      Sample:

      The sample was 52% female with an average age of 12.78. A majority of the sample received free or reduced price lunch (88%) and lived in a two-parent family (92%). The racial/ethnic breakdown of the sample was 27% White, 23% Black, 30% American Indian, 8% Latino, and 12% identified as mixed or “other.”

      Measures:

      Participants were assessed using the School Success Profile Plus, which includes numerous subscales, but only four were used as outcomes in the analysis: self-esteem, aggression, internalizing symptoms, and school hassles. Other measures such as friend rejection, parent-child conflict, religious orientations, school satisfaction, future optimism, parent support, teacher support, friend support, delinquent friends, peer pressure, perceived discrimination, and school danger were used in the matching. The alphas for all the measures ranged over time points from .70 to .97. For the primary outcomes of self-esteem, school hassles, aggression, and internalizing, the alphas ranged over time from .86 to .95.

      Analysis:

      The conditions were matched using two propensity score models (inverse weighting and one-to-one matching) that accounted for the difficulties with non-random assignment by county. Then hierarchical linear modeling was used to analyze individual change over time with controls for baseline covariates used in the matching. Tests for clustering showed an intraclass correlation coefficient of only .03, which led the authors to ignore school clustering in the hierarchical models. Also, the tests for program effects did not examine differences across conditions in change with a time-by-condition term, but appear to examine the average condition difference across all assessments with a condition term alone.
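      To illustrate why a small intraclass correlation does not by itself settle the clustering question, the standard Kish design effect, 1 + (m - 1) × ICC, can be computed for an average cluster size m. The sketch below uses the reported ICC of .03 and the smallest matched sample size reported (1246), but the cluster size of 100 students per school is purely hypothetical, since the summary does not report cluster sizes.

```python
def design_effect(icc, avg_cluster_size):
    """Kish design effect: variance inflation due to clustering."""
    return 1 + (avg_cluster_size - 1) * icc

def effective_sample_size(n, icc, avg_cluster_size):
    """Number of independent observations that n clustered observations are worth."""
    return n / design_effect(icc, avg_cluster_size)

icc = 0.03                     # reported in the study summary
avg_students_per_school = 100  # hypothetical; not reported in the summary
n_matched = 1246               # smallest matched sample size reported

deff = design_effect(icc, avg_students_per_school)
n_eff = effective_sample_size(n_matched, icc, avg_students_per_school)
print(round(deff, 2), round(n_eff))
```

      Under these assumed numbers, standard errors that ignore clustering would be understated, so the practical importance of an ICC of .03 depends on how many students share a school.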

      Intent-to-Treat: The study imputed missing data for subjects with data for at least two assessments. Based on data using multiple imputation, the tables in Guo et al. (2015) listed 1) average treatment effects for an intent-to-treat sample, 2) average treatment effects of the treated for a non-intent-to-treat sample, and 3) effects for a matched subsample of subjects. The sample sizes were not presented clearly, but the study appears to have followed intent-to-treat procedures by using as many subjects as possible in the average treatment effects analysis. Also using multiple imputation, Smokowski et al. (2016) presented results for the full sample of subjects with at least two assessments.

      Outcomes

      Implementation Fidelity:

      Teachers were trained on program implementation, but fidelity was assessed via program dosage, or the number and duration of lessons taught. The authors concluded that fidelity was high, particularly in years 2 and 3.

      Baseline Equivalence:

      Conditions differed significantly on most baseline measures for the original sample, but Guo et al. (2015) stated that the matched samples were balanced and that details could be requested. In Smokowski et al. (2016), equivalence was tested across dosage groups rather than conditions. They noted that 10-11 variables were not balanced after the propensity score adjustment.

      Differential Attrition:

      Perhaps because of the use of multiple imputation, details on attrition or an attrition analysis were not provided.

      Posttest:

      As reported by Guo et al. (2015), the program had a significantly positive effect on scores for self-esteem and school hassles in the intervention group compared to the control group. It did not significantly affect aggression, and it may have had a negative effect on internalizing scores (the authors stated that the effect was significant in a two-tailed test but not in a one-tailed test).

      Using the same data, but reporting one-tailed significance tests, Smokowski et al. (2016) found 3 significant dosage effects in 28 tests. Students who received 3 years of the PA intervention and a high number of PA lessons had a significantly higher self-esteem score than those who received 0 years of PA or zero lessons (p < .01). Participants who received one year of PA also reported lower school hassle scores than those who received 0 years, but at p < .05, this may not have been significant with a two-tailed test. Dosage was not related to aggression. One-tailed tests of significance showed no relationship between dosage and internalizing symptoms, but the results were in the opposite direction, showed relatively large coefficients, and may have been significant with two-tailed tests. Specifically, youth who received 3 years of PA, and those who received a low (0-31) and moderate (63-103) number of lessons, had higher internalizing scores compared to youth who received 0 years of PA or zero lessons.
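      The distinction between one-tailed and two-tailed tests drawn above can be made concrete with a small sketch. It assumes a test statistic that is approximately standard normal, and the z values are hypothetical, chosen only to show the two situations described: a result that passes a one-tailed test but not a two-tailed one, and a result in the unexpected direction that a one-tailed test cannot detect.

```python
import math

def upper_tail_p(z):
    """One-tailed p-value when the hypothesized effect is in the positive direction."""
    return 0.5 * math.erfc(z / math.sqrt(2))

def two_tailed_p(z):
    """Two-tailed p-value: counts extreme results in either direction."""
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical z statistics (not taken from the study).
z_expected_direction = 1.8   # modest effect in the hypothesized direction
z_opposite_direction = -2.2  # sizable effect in the opposite direction

for z in (z_expected_direction, z_opposite_direction):
    print(z, round(upper_tail_p(z), 3), round(two_tailed_p(z), 3))
```

      With z = 1.8 the one-tailed p-value falls below .05 while the two-tailed value does not. With z = -2.2 the pattern reverses, which mirrors the internalizing results: an effect opposite to the hypothesis goes unflagged by a one-tailed test even when a two-tailed test would call it significant.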

      Long-Term:

      The data were collected in four waves, but the program was ongoing over the full period.