My Personal GPT Prototype: CNI Insights Agent
The concept of a "personal GPT" is fascinating. It enables the curation of a customized knowledge base or a specialized collection of data and documents, allowing for interactive engagement.
With the advent of word processing, our interaction with text transformed. We gained the ability to copy and paste, conduct quick searches, and smoothly scroll through content. This evolution also enabled the immediate integration of graphics, dictionaries, and thesauruses, all while allowing us to reformat text effortlessly.
The personal GPT takes these capabilities to a new level. It empowers us to quickly summarize multiple documents, analyze and probe them, detect biases, measure sentiment, and even perform tasks like classification, clustering, or translation.
Beyond that, a personal GPT allows us to creatively transform text into poems, outlines, visual representations, prose, or language tailored to different audience types. This represents a significant leap in how we interact with and utilize words and ideas.
My Prototype
I’ve used ChatGPT 3.5 a lot this year, but when 4.0 enabled the personal GPT option, I had to try it. As a proof of concept, I scraped the transcripts from the last three CNI meetings.
Spring 2023 181,507 words
Fall 2022 165,629 words
Spring 2022 187,962 words
These three large documents formed a knowledge base that I titled: CNI Insights Agent.
This is intriguing from two perspectives. First, I can engage with approximately 20 presentations within a single conference. Second, when considering all three conferences collectively, there is a cumulative total of more than 60 presentations.
CNI enables us to watch all these talks via their YouTube channel. But now, instead of just passively viewing, I can do so much more.
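To give a sense of scale, here is a minimal sketch that totals the three transcript word counts listed above. The per-talk figure is only a rough estimate: it assumes about 60 talks overall, based on my approximation of 20 presentations per conference, and the variable names are just for illustration.

```python
# Word counts per conference transcript, as listed above.
word_counts = {
    "Spring 2023": 181_507,
    "Fall 2022": 165_629,
    "Spring 2022": 187_962,
}

total_words = sum(word_counts.values())

# Assumption: roughly 20 talks per conference, so about 60 in total.
APPROX_TALKS = 60
words_per_talk = total_words / APPROX_TALKS

print(f"Total words in knowledge base: {total_words:,}")
print(f"Rough average words per talk: {words_per_talk:,.0f}")
```

By this estimate the knowledge base holds a little over half a million words, or somewhere around nine thousand words per talk.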
Exploring Themes & Concepts:
With the agent, I explored the overarching themes spanning across all three conferences:
Collaborative Projects and Cultural Heritage: There's a significant emphasis on collaborative projects, particularly in the context of cultural heritage. The documents discuss initiatives that involve cross-departmental and cross-institutional collaboration, leveraging collective resources for greater impact.
Digital Transformation and Innovation: The importance of digital transformation in preserving and enhancing access to collections and resources is a key focus. This includes developing new digital tools and platforms to facilitate research and education.
Metadata and Data Management: There's a strong focus on improving metadata and data management practices. This includes discussions on the challenges of integrating disparate data sources and the importance of robust metadata for enhancing discoverability and usability of digital assets.
Openness and Accessibility: Themes of openness and accessibility recur, with an emphasis on making digital resources widely accessible while addressing issues like bias and racism embedded in metadata.
Sustainability and Long-term Viability: The documents highlight the importance of ensuring the sustainability and long-term viability of digital projects and infrastructure.
I also examined a particular topic at a specific conference. When I investigated the theme of “open science” during the Spring 2022 conference, these insights emerged:
Open Access Policy Implementation Challenges
Open Infrastructure Systems
Challenges with Identifier Schemas
Pathways to Open Access Strategy
Assessing Openness of Pathways and Infrastructure
Publisher Negotiations and Support for Persistent Identifiers
Community Initiatives and Open Infrastructure
Enriching and Opening Metadata
Navigating Challenges Without Compromising Open Access Values
Now let’s compare the same topic (open science) at the Spring 2023 conference to see what’s different:
Digital Humanities and Community Engagement
Interdisciplinary Research and Open Science
Accessible and Inclusive Open Research Practices
Technology-Driven Solutions
Community-Based Research Approaches
Policy and Advocacy in Research
The Agent actually provided paragraphs summarizing each of the bullets above. For brevity, I am just listing the titles. However, I can elect to go deeper. As an example, here is the response I received when I requested additional information about “Community-Based Research Approaches” and asked for it in prose form:
Community-Based Research Approaches, as highlighted in the CNI Spring 2023 insights, represent a significant shift in the academic and research landscape towards more inclusive, collaborative, and impact-driven methodologies. This approach fundamentally rethinks the traditional researcher-subject dynamic, positioning local communities not just as subjects of study but as active partners in the research process.
At the core of community-based research is the principle of co-creation of knowledge. This means that researchers work closely with community members to identify research questions, design methodologies, and interpret findings. Such an approach ensures that the research is grounded in the actual needs and contexts of the communities involved, thereby increasing its relevance and potential for positive impact.
One of the key elements of community-based research is its emphasis on mutual learning and respect. Researchers bring their academic expertise, while community members contribute their lived experiences and local knowledge. This exchange enriches the research process, leading to more nuanced and comprehensive understandings of complex social, environmental, or cultural issues.
Another important aspect is the focus on actionable outcomes. Community-based research often aims to address specific challenges or improve certain conditions within the community. This could involve developing new social programs, influencing policy changes, or providing critical data to support community advocacy efforts. The goal is not just to generate knowledge but to catalyze real-world change.
Additionally, this approach advocates for the democratization of knowledge. By involving community members in the research process and making findings accessible and understandable, it breaks down barriers between academia and the public. This is in line with the principles of open science, which seek to make scientific research more transparent, accessible, and beneficial for the wider society.
In summary, community-based research approaches signify a more engaged, equitable, and impactful way of conducting research. By forging partnerships with local communities and focusing on real-world applications, this approach has the potential to not only advance academic knowledge but also drive meaningful societal progress.
I can perform various types of searches. For example, here is what I received when I asked for information about “cloud lab”:
The "Cloud Lab" concept, as detailed in my knowledge sources, refers to a highly automated and remote-controlled laboratory setup that allows researchers to conduct experiments without physically being in a lab. This system was notably developed at Carnegie Mellon University (CMU), modeled on the Emerald Cloud Lab in San Francisco. Key aspects of the Cloud Lab include:
1. Remote Experimentation: Researchers design their experiments using an operating system, and these designs are communicated to the Cloud Lab. The research is conducted remotely, and the resulting data sets are returned to the researcher for analysis.
2. Efficiency and Accessibility: The Cloud Lab is structured with racks that allow quick addition or withdrawal of equipment, optimizing for efficiency. It is being developed in CMU's library's offsite warehouse, which reflects the intersection of the future and the history of science. This approach eliminates the need for science faculty to negotiate expensive startup packages for designing their own labs, as they are provided access to the Cloud Lab instead.
3. Automation and Control: Researchers can execute experiments by issuing commands over the internet to a central facility. This setup allows control over a vast array of parameters and operations, from very small to large-scale transfers and processes. There are over 200 different models of top-class instrumentation available online, and the facilities can run experiments on demand, 24/7, throughout the year.
4. Data Management and Open Science: The Cloud Lab also focuses on tracking experiments and managing the return of data sets for modeling and analysis. There is an emphasis on integrating the Cloud Lab's operations with the broader objectives of open science. This integration includes tools, training, events, collaboration, outreach, and assessment, aiming to provide an end-to-end approach for open science that encompasses everything from experimental design to the reuse of data.
Overall, the Cloud Lab represents a significant shift in how scientific research can be conducted, emphasizing remote control, automation, efficiency, and integration with open science principles.
Sentiment
I’m also curious about the idea of sentiment analysis. Here is an example of what I received when seeking summaries of talks that are optimistic:
Paul Courant's Keynote Speech on Libraries and Digital Utopia: This speech by Paul Courant, the 2022 Paul Evan Peters Award winner, is notably optimistic. He discusses the intersection of libraries, digital technology, and scholarly work with a clear sense of hope and possibility. Courant reflects on the advancements in digital libraries, highlighting the progress made in the digitization of books and the creation of expansive digital libraries. He emphasizes the potential of these digital resources to transform how knowledge is accessed and used, underscoring the positive impact on academic research and public access to information.
I can also apply quantitative measures to the idea of sentiment. Here is an example from a presentation that I personally enjoyed watching:
Machine Learning, Text Summarization, and Optimizing Scholarship for Citizen Audiences and Discovery (Clark, Sterman, Laden) (video)
To perform a quantitative sentiment analysis of this text, I will use natural language processing techniques. The sentiment analysis will assess the overall sentiment (positive, neutral, or negative) of the text and provide a quantitative score. The text will be analyzed in segments to capture the sentiment expressed in different parts of the presentation, as the content covers various topics and aspects of the research. Let's proceed with the analysis.
The sentiment analysis of the selected text segments from the presentation yielded the following results:
Title: "Machine Learning, Text Summarization, and Optimizing Scholarship for Citizen Audiences and Discovery"
- Sentiment: Neutral (Polarity: 0.0, Subjectivity: 0.0)
Abstract: Discussion about the accessibility of research concepts to non-expert audiences, and the role of librarians in promoting and translating research for diverse audiences.
- Sentiment: Slightly Positive (Polarity: 0.0986, Subjectivity: 0.3542)
Introduction: Introduction of the presenters and acknowledgement of their colleagues.
- Sentiment: Positive (Polarity: 0.2, Subjectivity: 0.3)
Translation Section: Advocacy for translation of scholarly work and treating scholarship as a conversation, emphasizing the role of libraries.
- Sentiment: Somewhat Positive (Polarity: 0.18, Subjectivity: 0.5983)
Conclusion: Invitation for further discussion and encouragement to engage with their work on GitHub.
- Sentiment: Fairly Positive (Polarity: 0.4333, Subjectivity: 0.6667)
Overall, the presentation exhibits a generally positive tone, with the highest positivity observed in the conclusion. The abstract, while slightly positive, shows a more neutral-to-positive sentiment compared to other sections, reflecting a more factual and descriptive tone. The introduction and translation sections also convey positive sentiments, indicating a constructive and optimistic approach towards the topics discussed.
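As a quick check on that summary, here is a minimal sketch that aggregates the per-segment scores reported above. The agent did not say which tool produced these numbers, so this only re-uses the values it returned. The unweighted average is my own simplifying assumption, since the segments differ in length.

```python
from statistics import mean

# (segment, polarity, subjectivity) as reported by the agent above.
segments = [
    ("Title", 0.0, 0.0),
    ("Abstract", 0.0986, 0.3542),
    ("Introduction", 0.2, 0.3),
    ("Translation Section", 0.18, 0.5983),
    ("Conclusion", 0.4333, 0.6667),
]

# Unweighted averages across segments (an assumption, see above).
mean_polarity = mean(polarity for _, polarity, _ in segments)
mean_subjectivity = mean(subjectivity for _, _, subjectivity in segments)

# Segment with the highest polarity score.
most_positive = max(segments, key=lambda seg: seg[1])[0]

print(f"Mean polarity: {mean_polarity:.4f}")
print(f"Mean subjectivity: {mean_subjectivity:.4f}")
print(f"Most positive segment: {most_positive}")
```

The average polarity sits modestly above zero and the conclusion scores highest, which is consistent with the agent's description of a generally positive tone.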
Regeneration
What I really enjoy about Generative AI is the ability to reshape ideas. The CNI Insights Agent struggled to do this at scale across all three conferences. But let’s take an example from one talk – and transform it.
Don Waters
Lightning Round Presentation, CNI Spring 2023
Information Infrastructure to Address Societal Grand Challenges
1,127 words
Concise Summary:
Don Waters' talk, affiliated with CNI and previously with the Mellon Foundation, centers on the critical role of digital information infrastructure in advancing interdisciplinary knowledge, especially in addressing climate change. He labels climate change a "wicked" challenge, requiring collaborative efforts across various fields, including humanities, to understand and tackle its complexities. Waters emphasizes the contributions of disciplines like philosophy, religious studies, and the arts in shaping our understanding of justice, the good life, and cultural perspectives. He also highlights the increasing demand for information infrastructure in university research centers focused on climate change, stressing the need for tools that support collaboration, literature management, data mining, and visualization. Waters concludes by inviting discussions on scaling services and infrastructure for interdisciplinary research, particularly in climate studies.
His presentation as a poem:
In digital realms where knowledge streams,
Don Waters speaks, his vision gleams.
With CNI's guiding light,
And Mellon's past, his insight's bright.
"Climate change," he says, "a wicked foe,
Complex, vast, in constant flow.
Needs minds combined, both far and wide,
In fields diverse, side by side.
Humanities, with wisdom old,
In stories told, in courage bold.
Philosophy, with thoughtful gaze,
Religious texts, with ancient phrase.
Arts and letters, history's page,
Anthropology's wise sage,
Together weave a richer view,
Of what is just, and good, and true.
In university halls, there's a quest,
For tools to meet this grand test.
Collaboration's vital role,
In mining data, reaching goal.
Visualize, model, understand,
Across this vast and changing land.
A digital bridge, a shared mission,
To meet the challenge, our ambition.
Let's scale our reach, our service height,
In climate's fight, with all our might.
Join hands, join hearts, in this endeavor,
For a future safe, forever and ever."
Thus, Don Waters, in wisdom's light,
Urges us to join the fight.
With digital tools, and hearts combined,
A better world, we're sure to find.
Pitching his presentation to media:
Dear [Journalist's Name],
I'm reaching out to highlight a groundbreaking talk by Don Waters, associated with CNI and formerly with the Mellon Foundation, focusing on the intersection of digital infrastructure and climate change. Waters emphasizes the need for interdisciplinary collaboration, integrating arts, humanities, and science, to address this global challenge. His insights on the growing demand for digital tools in university research centers for climate studies offer a fresh perspective on tackling environmental issues. This topic aligns well with current interests in technology, academia, and sustainable solutions, making it a compelling story for your audience.
Best,
[Your Name]
Pitching his presentation to an elementary school audience:
Dear [School Principal/Teacher's Name],
I'm thrilled to share an opportunity for your young students to learn about climate change in a fun and engaging way! Don Waters, an expert in the digital world, recently gave a talk that's perfect for sparking curiosity and imagination among elementary school students.
Imagine this: our planet is facing a big challenge with climate change, and we need heroes to help solve it. Don Waters shows how not all heroes wear capes – some use computers and cool technology to save the world! He talks about how people from different areas, like scientists, artists, and even students like yours, can work together using digital tools to make our planet healthier and happier.
This talk can be a great way to introduce your students to the importance of teamwork, creativity, and technology in solving big problems like climate change. It's a chance for them to see how learning today can turn them into the problem-solvers of tomorrow!
Would you be interested in arranging a fun and educational session for the students with this theme? I'd be happy to help set it up.
Looking forward to your response!
Warm regards,
[Your Name]
Re-casting his talk as a visual representation:
Wrap-up
That’s a very quick glimpse into what I’ve been doing over the Thanksgiving Break. I’ll share more posts on the CNI Insights Agent in the very near future.
Overall, I’m impressed with OpenAI’s personal GPT tool. It enables us to engage deeply across a series of texts. The CNI presentations, in particular, offer an abundance of thoughtful and pragmatic content, and they serve as an excellent sandbox for this type of exploration.