Giovanni Cerulli, Research Institute on Sustainable Economic Growth, National Research Council of Italy, Unit of Rome
Running Machine Learning in Stata: Performance and usability evaluation
This presentation surveys the machine learning (ML) commands available in Stata. It systematically categorizes and summarizes them and evaluates their performance and usability for tasks such as classification, regression, clustering, and dimension reduction. It also provides examples of how to use these commands with real-world datasets and compares their performance. The review aims to help researchers and practitioners choose appropriate ML methods and related Stata tools for their specific research questions and datasets, and to improve the efficiency and reproducibility of ML analyses using Stata. It concludes by discussing some limitations and future directions for ML research in Stata.

Mark Schaffer, Heriot-Watt University
pystacked and ddml: Machine learning for prediction and causal inference in Stata
This presentation explores the pystacked and ddml commands in Stata.

Meghan Cain, StataCorp
Bayesian model averaging
Are you unsure which predictors to include in your model? Rather than choosing one model, aggregate results across all candidate models to account for model uncertainty with Bayesian model averaging (BMA). Which predictors are important given the observed data? Which models are more plausible? How do predictors relate to each other across different models? BMA can answer these questions and many more. Stata 18 introduced the bma suite of commands to perform BMA in linear regression models. In this talk, you will learn how to explore influential models, make inferences, and obtain better predictions with BMA. I will demonstrate the utility of BMA for any researcher: Bayesian, frequentist, and everyone in between! No prior knowledge of the Bayesian framework is required.

Marea Sing, RBNZ’s Economics Directorate, and Guanyu Zheng, NZ Ministry of Business, Innovation and Employment
Sectoral reallocation and income growth in the labour market during the COVID-19 pandemic
This paper investigates the effects of the COVID-19 pandemic on the labour market in New Zealand. Using a comprehensive administrative dataset, we examine labour reallocation during the pandemic and link these reallocations to two distinct measures of income growth. Our findings reveal that COVID-19 presented as an atypical and relatively persistent reallocation shock to the New Zealand labour market. Notably, the surge in job-to-job transitions stemmed primarily from transitions between industries rather than within industries. Moreover, it is these between-industry transitions that were positively correlated with overall income growth in the labour market.

Arul Earnest, Monash University
Machine Learning Techniques to Predict Timeliness of Care among Lung Cancer Patients
Delays in the assessment, management, and treatment of lung cancer patients may adversely impact prognosis and survival. This study is the first to use machine learning techniques to predict the quality and timeliness of care among lung cancer patients, utilising data from the Victorian Lung Cancer Registry (VLCR) between 2011 and 2022, in Victoria, Australia.

Andrew Gray, University of Otago
ChatGPT and other large language models: How useful are they to statisticians using Stata?
Some statisticians, including Stata users, are already using ChatGPT and other LLMs for answers to questions about statistics, for code generation, or for data processing (e.g., sentiment analysis). Some researchers may already be using the technology to perform their analyses automatically. This presentation explores these four uses through examples and brief case studies.

Nyi Nyi Naing, Universiti Sultan Zainal Abidin
Beauty of STATA: Relevant and plausible
Stata suits users in medical and health sciences research for several reasons: data transfer from other databases is easy; intermediate and advanced statistical methods are available through both command and menu options; and the output is relevant and meaningful for inference, interpretation, and conclusions in both interventional studies (clinical and community trials) and observational studies (for example, cohort, case-control, and cross-sectional studies). It is also convenient for determining the minimum required sample size with appropriate power for those studies. Various regression methods, general linear models, and cross-sectional time series are frequently used by these researchers. Step-by-step procedures for statistical analysis using Stata are taught at basic, intermediate, and advanced levels to academic staff in universities, researchers at research institutes, clinicians and health personnel at ministries of health, biostatisticians, epidemiologists, and pharmaceutical company staff. The output for epidemiological studies is much superior to that of other software in terms of relevance and biological plausibility.

Mark Chatfield, University of Queensland
Nice log (and log-like) scaled axes
In this presentation, Mark will show how to i) create graph commands which nicely label a log-scaled axis and ii) produce a nice log-like-scaled axis showing 0 and ∞. With the exception of meta forestplot, Stata does not automatically label a log-scaled axis with multiplicative labels, e.g. 1/4, 1/2, 1, 2, 4. With a twoway graph, specifying yscale(log) will create a log-scaled y-axis but with additive labels, e.g. 1, 2, 3, 4. The niceloglabels command (Cox 2018) can suggest a variety of nice multiplicative labels, which can benefit community-contributed graph commands that use log-scaled axes. However, decisions still need to be made, such as when to choose which set of labels. There is no log-scale equivalent of _natscale to do this for you. I will show how I overcame this for my blandaltman and box_logscale commands (Chatfield 2023). The latter is an example of working with log-transformed data but labelling the axis with multiplicative, original-scale labels. The mylabels command (Cox 2022) is helpful here. I will also show how to use other transformations such as asinh(y/#) or logistic(#*log(y/#)) to produce a nice log-like-scaled axis showing 0 and ∞.

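The two ideas in the abstract above, multiplicative labels and log-like transformations that give 0 and ∞ a finite position, can be illustrated with a minimal Python sketch. This is not the speaker's code and not how niceloglabels, blandaltman, or box_logscale work; the function names and the constants `c` and `k` (standing in for the `#` placeholders in the abstract) are assumptions made for illustration only.

```python
import math


def multiplicative_labels(lo, hi):
    """Powers of 2 lying in [lo, hi], e.g. 1/4, 1/2, 1, 2, 4 (hypothetical helper)."""
    first = math.ceil(math.log2(lo))
    last = math.floor(math.log2(hi))
    return [2.0 ** k for k in range(first, last + 1)]


def asinh_scale(y, c=1.0):
    """asinh(y/c): defined at y = 0, close to linear near 0 and close to log for y >> c."""
    return math.asinh(y / c)


def logistic_log_scale(y, c=1.0, k=1.0):
    """logistic(k*log(y/c)): maps [0, inf] onto [0, 1], so 0 and infinity can be drawn."""
    if y == 0:
        return 0.0
    if math.isinf(y):
        return 1.0
    z = k * math.log(y / c)
    # Evaluate the logistic function in a form that cannot overflow for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


if __name__ == "__main__":
    print(multiplicative_labels(0.25, 4))
    for y in (0, 0.5, 1, 2, math.inf):
        print(y, logistic_log_scale(y))
```

With the logistic form, the value chosen for `c` lands at the midpoint of the axis, and `k` controls how quickly values away from `c` are pushed toward the two ends.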
Facilitated Panel Session - Teaching Stata
With Tai Bee Choo, National University of Singapore; Siew-Pang Chan, National University of Singapore; and Chris Erwin, Auckland University of Technology
Stata, a globally recognized software package, is pivotal in teaching statistics and data analysis across diverse university disciplines, including biostatistics, economics, econometrics, epidemiology, health sciences, and social sciences. This panel session offers an opportunity to hear the experiences of three distinguished lecturers who have used Stata extensively in their teaching for many years.

David White and Amy Grant, SDAS
Answering Stata assignments using Generative Artificial Intelligence: An Example
ChatGPT and Bard are now part of the research landscape. They are tools being used daily by students, professionals, academics, and researchers. We can choose to ignore them or acknowledge that they have a part in our practice. In this presentation, Amy and David demonstrate how these tools can be used (ineffectively and effectively) to develop answers to real assignment questions using Stata.

Zumin Shi, Qatar University
EpiTable
Exporting the results of a multivariable model to a Word document can be time-consuming. This presentation covers the epitable2 and epitable3 packages, developed to create the Table 2 and Table 3 used in epidemiological studies.