Computer network defense is a partnership between automated systems and human cyber security analysts. System behaviors, such as raising a high proportion of false alarms, likely affect analyst performance. Experimentation in the analyst-system domain is challenging because of the lack of access to security experts, the limited usability of attack datasets, and the training required to use security analysis tools. This paper describes Cry Wolf, an open source web application for user studies of cyber security analysis tasks. It also provides an open-access dataset of 73 true and false Intrusion Detection System (IDS) alarms derived from real-world examples of “impossible travel” scenarios. Cry Wolf and the impossible travel dataset were used in an experiment on the impact of IDS false alarm rate on analysts’ ability to correctly classify IDS alerts as true or false alarms. Results from that experiment are used to evaluate the quality of the dataset with the difficulty and discrimination index measures drawn from classical test theory. Many alerts in the dataset provide good discrimination for participants’ overall task performance.
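For readers unfamiliar with the two classical test theory measures, the following is a minimal sketch of how they are commonly computed, not the paper's own analysis code. It assumes the upper/lower group form of the discrimination index with 27% groups, which is one common convention; the function names, the `group_fraction` parameter, and the response matrix are hypothetical and purely illustrative. Each row is one participant and each column is one alert, with 1 marking a correct classification.

```python
from typing import List


def difficulty_index(responses: List[List[int]]) -> List[float]:
    """Proportion of participants who classified each item correctly."""
    n_participants = len(responses)
    n_items = len(responses[0])
    return [
        sum(row[j] for row in responses) / n_participants
        for j in range(n_items)
    ]


def discrimination_index(
    responses: List[List[int]], group_fraction: float = 0.27
) -> List[float]:
    """Upper/lower group discrimination index for each item.

    Participants are ranked by total score. For each item, the index is the
    proportion correct in the top group minus the proportion correct in the
    bottom group. The 27% group size is an assumed convention.
    """
    n_participants = len(responses)
    group_size = max(1, round(n_participants * group_fraction))
    ranked = sorted(responses, key=sum, reverse=True)
    upper = ranked[:group_size]
    lower = ranked[-group_size:]
    n_items = len(responses[0])
    return [
        (sum(r[j] for r in upper) - sum(r[j] for r in lower)) / group_size
        for j in range(n_items)
    ]


# Hypothetical data: 6 participants (rows) by 4 alerts (columns).
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 1],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
]

difficulty = difficulty_index(responses)
discrimination = discrimination_index(responses)
print(difficulty)
print(discrimination)
```

In this illustrative data, the last alert has a discrimination index of zero because high and low scorers classify it correctly at the same rate, while the second and third alerts separate the two groups completely.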