
Evaluation of Task Specific Productivity Improvements Using a Generative Artificial Intelligence Personal Assistant Tool

Brian S. Freeman, Kendall Arriola, Dan Cottell, Emmett Lawlor, Matt Erdman, Trevor Sutherland, Brian Wells

TL;DR

This study evaluates the productivity improvements achieved with a generative artificial intelligence personal assistant tool developed by Trane Technologies. It finds significant productivity enhancements, particularly for summarization and instruction-creation tasks, with improvements ranging from 3.3% to 69%.

Abstract

This study evaluates the productivity improvements achieved using a generative artificial intelligence personal assistant tool (PAT) developed by Trane Technologies. The PAT, based on OpenAI's GPT-3.5 model, was deployed on Microsoft Azure to ensure secure access and protection of intellectual property. To assess the tool's effect on productivity, an experiment compared the completion times and content quality of four common office tasks: writing an email, summarizing an article, creating instructions for a simple task, and preparing a presentation outline. Sixty-three participants were randomly divided into a test group using the PAT and a control group performing the tasks manually. Results indicated significant productivity enhancements, particularly for tasks involving summarization and instruction creation, with improvements ranging from 3.3% to 69%. The study further analyzed factors such as the age of users, response word counts, and quality of responses, revealing that PAT users generated more verbose and higher-quality content. An 'LLM-as-a-judge' method employing GPT-4 was used to grade the quality of responses, and it effectively distinguished between high- and low-quality outputs. The findings underscore the potential of PATs in enhancing workplace productivity and highlight areas for further research and optimization.
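
To make the reported percentages concrete, the sketch below shows one plausible way to express a productivity improvement: the relative reduction in mean task completion time between the control group and the test group. This definition is an assumption, since the abstract does not state the formula the authors used. The function name `productivity_improvement` and the sample times are hypothetical and are not data from the study.

```python
from statistics import mean


def productivity_improvement(control_times, test_times):
    """Return the percentage reduction in mean completion time.

    Assumed definition (not confirmed by the source):
        100 * (mean(control) - mean(test)) / mean(control)
    """
    control_mean = mean(control_times)
    test_mean = mean(test_times)
    if control_mean <= 0:
        raise ValueError("control completion times must have a positive mean")
    return 100.0 * (control_mean - test_mean) / control_mean


# Hypothetical completion times in minutes, for illustration only.
control_group = [10.0, 12.0, 8.0]   # tasks done manually
test_group = [5.0, 6.0, 4.0]        # tasks done with the assistant

improvement = productivity_improvement(control_group, test_group)
print(f"Improvement: {improvement:.1f}%")
```

Under this definition a positive value means the test group finished faster on average, and a negative value means the tool slowed participants down.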

Paper Structure

This paper contains 23 sections, 3 equations, 9 figures, 9 tables.

Figures (9)

  • Figure 1: Respondent gender and age
  • Figure 2: Respondent education and experience with gen AI
  • Figure 3: Respondent roles
  • Figure 4: Aggregated results for tasks
  • Figure 5: Evaluation of respondent's age vs completion time for tasks
  • ...and 4 more figures