
Is it possible to provide suggested "subquestions" as well as a main question? #380

Open
SeanJAmmirati opened this issue Mar 6, 2024 · 10 comments

Comments

@SeanJAmmirati

It looks like the package only provides the ability to input one main question and it automatically generates the subquestions. This is very cool, but is it possible to also suggest subquestions to be included with the main question?

researcher = GPTResearcher(query, subquestions, report_type)
where subquestions is a list of sub-questions.
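Sketched out, the proposed constructor shape might look like this. To be clear, this is hypothetical: the current GPTResearcher API does not take a `subquestions` parameter, and the class below is a stand-in used only to illustrate the requested interface.

```python
# Hypothetical sketch of the requested API shape. GPTResearcher does not
# currently accept user-supplied sub-questions; all names are illustrative.

class ResearcherWithSubquestions:
    def __init__(self, query, subquestions=None, report_type="research_report"):
        self.query = query
        # User-provided sub-questions to be answered in the report,
        # instead of (or in addition to) auto-generated ones.
        self.subquestions = list(subquestions or [])
        self.report_type = report_type

    def outline(self):
        # The fixed outline the researcher would follow: the main query
        # first, then the user-supplied sub-questions in order.
        return [self.query] + self.subquestions


researcher = ResearcherWithSubquestions(
    query="What is the outlook for grid-scale batteries?",
    subquestions=["What do they cost today?", "Which chemistries dominate?"],
)
print(researcher.outline())
```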

Thanks in advance for your advice / feedback.

@assafelovic
Owner

Hey @SeanJAmmirati can you please share more about the motivation for this feature? What are you trying to achieve with it and what output are you expecting?

@SeanJAmmirati
Author

Thanks for getting back to me @assafelovic!

The motivation for this feature is to generate a "larger report" that contains multiple user-defined sub-themes, all fitting together into one unified report under a single main top-level query.

Based on my experience using GPTResearcher so far, it does a great job of automatically generating a set of sub-questions and fleshing out a report based on a primary top-level query. What I'd like to do is say: here is a primary top-level query, and here are some sub-questions to that query. It would then use that outline to build a unified report that answers the primary question by answering the sub-questions (and, where necessary, adds what I'll call "sub-sub questions" for lack of a better term).

Today's output:
Primary Top-Level Query (user input)
1. Sub question (AI generated)
2. Sub question (AI generated)
3. etc.

Requested output:
Primary Top-Level Query (user input)
1. Sub question (user input, but answer AI generated)
2. Sub question (user input, but answer AI generated, including sub-sub questions)
2a. AI generated
2b. AI generated
2c. AI generated
3. Sub question (user input)
4. etc.
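For illustration, the requested mixed outline could be represented as a simple nested structure that records whether each item came from the user or the model. This is just a sketch of the data shape, not an actual GPT Researcher type:

```python
# Illustrative representation of the requested outline: user-provided
# sub-questions at the top level, AI-generated sub-sub-questions nested
# beneath them where needed. Not a real GPT Researcher structure.

outline = {
    "query": "Primary top-level query",  # user input
    "subquestions": [
        {"question": "Sub question 1", "source": "user", "children": []},
        {"question": "Sub question 2", "source": "user", "children": [
            {"question": "Sub-sub question 2a", "source": "ai"},
            {"question": "Sub-sub question 2b", "source": "ai"},
        ]},
        {"question": "Sub question 3", "source": "user", "children": []},
    ],
}

# Flatten for display, numbering nested items like the sketch above
# (1., 2., 2a., 2b., 3., ...).
for i, sq in enumerate(outline["subquestions"], start=1):
    print(f"{i}. {sq['question']} ({sq['source']})")
    for j, child in enumerate(sq["children"]):
        print(f"  {i}{chr(ord('a') + j)}. {child['question']} ({child['source']})")
```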

I realize you could use GPTResearcher to build a standalone report for each sub-question and then merge them into one document, but then you lose the advantage of a unified answer.

Let me know if this makes sense?

@proy9714
Contributor

@SeanJAmmirati It absolutely makes sense! In fact I was considering using this approach myself for generating a detailed report. This would overcome the limitation of having the report restricted to two pages.

But there are some challenges:

  1. We would need an LLM call to arrange the sub-topics in a logical order.
  2. We would need multiple LLM calls, one to generate the report for each subtopic.
  3. We would need to ensure in the prompt that what the LLM previously generated for one subtopic is not generated again during the next call, to avoid duplicate information.
  4. It would be costly, as there would be multiple LLM calls.
  5. It would take much longer to generate the report.
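As a rough sketch of points 2 and 3 above (an assumed design, not the package's actual implementation; `call_llm` is a placeholder, not a real client), each subtopic gets its own call, and the previously generated sections are carried in the prompt so the model is instructed not to repeat them:

```python
# Sketch of the multi-call flow described above. `call_llm` is a stand-in
# for a real LLM client; the prompt wording is illustrative only.

def call_llm(prompt):
    # Placeholder: a real implementation would call an LLM API here.
    return f"[report for: {prompt[:40]}...]"

def generate_report(task, subtopics):
    sections = []
    written_so_far = ""
    for topic in subtopics:
        # Point 3: include prior sections in the prompt so the model is
        # told not to repeat information already covered.
        prompt = (
            f"Main task: {task}\n"
            f"Write a section on: {topic}\n"
            f"Do not repeat anything from these existing sections:\n"
            f"{written_so_far}"
        )
        section = call_llm(prompt)  # Point 2: one call per subtopic
        sections.append(section)
        written_so_far += section + "\n"
    return "\n\n".join(sections)

print(generate_report("solar energy outlook", ["costs", "policy"]))
```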

If this seems OK to implement, then I have some ideas I can try, @assafelovic.

@assafelovic
Owner

@proy9714 this is a long-awaited feature with huge impact, so I'm happy to see what you bring! I'd take a look here as well for inspiration: https://github.com/langchain-ai/langgraph/blob/main/examples/storm/storm.ipynb

It uses the STORM methodology. Generally, you'd generate subsections and then generate content for each, leading to much longer and more detailed research reports.

Looking forward to seeing your further impact on this project! Regarding costs, maybe we can consider adding a flag in the config file for each report generation, or using an "advanced in depth" mode for longer but costlier reports. Also feel free to ping me on the Discord community channel to discuss further! 🚀

@proy9714
Contributor

Ok sure...on it 🫡

@therealtimex


I strongly recommend the Storm methodology. I'm glad that you brought it up.

@SeanJAmmirati
Author

Hi @proy9714 how is this going? I'm not an engineer so can't help with development, but let me know if there is any way I can support you on this work.

@proy9714
Contributor


Hey, sorry for the late response. I'm on vacation right now... I'll be available early next month.

@dphiggs01
Contributor

Hi @assafelovic @proy9714 @SeanJAmmirati ,

I'm very new to this project and just going through the code for the first time. (Very cool stuff!)

The STORM methodology seems interesting. But as I was reading the comments, I thought a simple, intermediate solution might be to update the current prompt that generates subtopics.

You could inject the user-provided subtopics into the current prompt that requests the LLM to build additional topics.

The prompt should be engineered such that {max_iterations} are preserved, the main {task} is primary, and the provided {subtopics} must be included in the output. Additional topics should not repeat any provided topic and should fill any gaps in context to create a cohesive outline.
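A minimal sketch of that prompt change, under the constraints just described (illustrative wording only; the actual GPT Researcher subtopic prompt and its variable names differ):

```python
# Sketch of a subtopic prompt that injects user-provided subtopics while
# preserving the max_iterations cap. Wording is illustrative, not the
# project's actual prompt.

def build_subtopic_prompt(task, user_subtopics, max_iterations):
    provided = "\n".join(f"- {s}" for s in user_subtopics)
    # Budget for AI-generated additions after the required ones.
    remaining = max(0, max_iterations - len(user_subtopics))
    return (
        f"Main task: {task}\n"
        f"These subtopics MUST appear in the output:\n{provided}\n"
        f"Add up to {remaining} additional subtopics that fill gaps in "
        f"context without repeating the ones provided, so the final list "
        f"of at most {max_iterations} subtopics forms a cohesive outline."
    )

print(build_subtopic_prompt("solar energy", ["costs", "policy"], 5))
```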

I have no expertise in prompt engineering, so I am not sure how well an LLM would do on this task.
I'd love to hear your thoughts on this simplified approach. If it doesn't align with your plans, that's fine to hear too!

Thanks,
Dan

@proy9714
Contributor

Hey @dphiggs01, I have in fact thought about this same approach: combining user-provided subtopics with ones generated by the AI. It's absolutely possible. I'll definitely get to it when I have time.
