Lecture 4
Communications
• General communication skills:
– Oral and written
– Formal and informal
– Talk to people with different levels of background
• Be clear, concise, accurate, and adaptive (elaborate
with examples, summarize by abstraction)
• English proficiency
• Get used to talking to people from different fields
Persistence
• Work only on topics that you are passionate about
• Work only on hypotheses that you believe in
• Don’t draw negative conclusions prematurely or
give up easily
– Positive results may be hidden in negative results
– In many cases, negative results don’t completely reject
a hypothesis
• Be comfortable with criticism of your work (learn
from negative reviews of a rejected paper)
• Think of possibilities of repositioning a work (what
you learn from an unsuccessful exploration can often
Optimize Your Training
• Know your strengths and weaknesses
– strong in math vs. strong in system development
– creative vs. thorough
– …
• Train yourself to fix weaknesses
• Find strategic partners
• Position yourself to take advantage of your strengths
How to identify a good research
problem?
What is a Good Research Problem?
• Well-defined: Would we be able to tell whether you’ve solved
the problem?
• Highly important: Who would really care about the solution to
the problem? Does it solve a big pain?
– Identify fundamental problems
– Dream big to identify novel application opportunities
• Solvable: Is there any clue about how to solve it? Do you
have a baseline approach? Do you have the needed
resources?
• Matching your strength: Are you good at solving this kind of
problem?
Challenge-Impact Analysis
[Figure: research problems placed by Level of Challenges (vertical axis, marked Known and Unknown) against Impact/Usefulness (horizontal axis)]
• High impact, high risk (hard): good long-term research problems
• High impact, low risk (easy): good short-term research problems
• Difficult basic research problems, but questionable impact
• Low impact, low risk: bad research problems (may not be publishable)
• Good applications: not interesting for research
• Other labels in the figure: novel application; “entry point” problems
Optimizing “Research Return”:
Pick a Problem Best for You
[Figure: High (Potential) Impact, Your Passion, and Your Strength; the best problems for you lie where all three meet]
Find your passion: If you didn’t have to work/study for money, what would you do?
Test of impact: If you were given $1M to fund a research project, what would you fund?
Find your strength/Avoid your weakness: What are you (not) good at?
How to Find a Problem?
• Application-driven (Find a nail, then make a hammer)
– Identify a need of people/users that cannot currently be
satisfied well (“complaints” about current data/information
management systems?)
– How difficult is it to solve the problem?
• No big technical challenges: do a startup
• Lots of big challenges: write a research proposal
– Identify one technical challenge as your topic
– Formulate/frame the problem appropriately so that you can solve
it
• Aim at a completely new application/function (find a
high-stakes nail)
How to Find a Problem? (cont.)
• Tool-driven (Hold a hammer, and look for a nail)
– Choose your favorite state-of-the-art tools
• Ideally, you have a “secret weapon”
• Otherwise, bring tools from area X to area Y
– Look around for possible applications
– Find a novel application that seems to match your tools
– How difficult is it to use your tools to solve the problem?
• No big technical challenges: do a startup
• Lots of big challenges: write a research proposal
– Identify one technical challenge as your topic
– Formulate/frame the problem appropriately so that you can solve
it
• Aim at an important extension of the tool (find an unexpected
application and use the best hammer)
How to Find a Problem? (cont.)
• In practice, you do both in various kinds of ways
– You use your imagination, or talk to people in
application domains to identify new “nails”
– You take courses and read literature to acquire the newest
or most powerful “hammers”
– You check out related areas for both new “nails” and
new “hammers”
– You read visionary papers and the “future work”
sections of research papers, and then take a problem
from there
– …
How to identify, frame, and refine a
research problem?
General Steps to Define a Research
Problem
• Generate and Test
• Raise a question
• Novelty test: Figure out to what extent we know how
to answer the question
– There’s already an answer to it: Is the answer good
enough?
• Yes: not interesting, but can you make the question
more challenging?
• No: your research problem is how to get a better
answer to the raised question
– No obvious answer: you’ve got an interesting problem
to work on
General Steps to Define a Research
Problem (cont.)
• Tractability test: Figure out whether the raised question can
be answered
– I can see a way to answer it or potentially answer it: you’ve got a
solvable problem
– I can’t easily see a way to answer it: Is it because the question is
too hard or you’ve not worked hard enough? Try to reframe the
problem to make it easier
• Evaluation test: Can you obtain a data set and define
measures to test solutions/answers?
– Yes: you’ve got a clearly defined problem to work on
– No: can you think of any way to indirectly test the solutions/
answers? Can you reframe the problem to fit the data?
• Every time you reframe a problem, try to do all three tests
again.
Frame a New Computation Task
• Define basic concepts
• Specify the input
• Specify the output
• Specify any preferences or constraints
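As a rough illustration of the four framing steps above, the sketch below writes down a made-up top-k retrieval task. Every name in it (Document, TaskInput, TaskOutput, satisfies_constraints, overlap_baseline) is a hypothetical choice for this example, not something from the lecture, and the term-overlap ranker is only a placeholder baseline.

```python
from dataclasses import dataclass
from typing import Sequence


# Basic concepts (illustrative assumptions, not from the lecture).
@dataclass(frozen=True)
class Document:
    doc_id: str
    text: str


# Input: a query, a collection, and the constraint parameter k.
@dataclass(frozen=True)
class TaskInput:
    query: str
    collection: Sequence[Document]
    k: int  # constraint: return at most k results


# Output: document ids, best first.
TaskOutput = list[str]


def satisfies_constraints(inp: TaskInput, out: TaskOutput) -> bool:
    """Check the stated constraints: at most k ids, no duplicates, all known."""
    known_ids = {d.doc_id for d in inp.collection}
    return (
        len(out) <= inp.k
        and len(set(out)) == len(out)
        and all(doc_id in known_ids for doc_id in out)
    )


def overlap_baseline(inp: TaskInput) -> TaskOutput:
    """Placeholder baseline: rank by number of distinct query terms matched."""
    query_terms = set(inp.query.lower().split())
    scored = [
        (len(query_terms & set(d.text.lower().split())), d.doc_id)
        for d in inp.collection
    ]
    # Sort by score descending, then by id so the order is deterministic.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for score, doc_id in scored[: inp.k] if score > 0]
```

Writing the input, output, and constraints as types tends to expose vague spots in the problem statement early, and a trivial baseline gives a first answer to "Do you have a baseline approach?".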
From a new application to
a clearly defined research problem
• Try to picture a new system, and thus clarify what new functionality is to be
provided and what benefit you’ll bring to a user
• Among all the system modules, which are easy to build and which are
challenging?
• Pick a challenge and try to formalize the challenge
– What exactly would be the input?
– What exactly would be the output?
• Is this challenge really a new challenge (not immediately clear how to
solve it)?
– Yes, your research problem is how to solve this new problem
– No, it can be reduced to some known challenge: are existing methods
sufficient?
• Yes, not a good problem to work on
• No, your research problem is how to extend/adapt existing methods
to solve your new challenge
• Tuning the problem
Tuning the Problem
[Figure: Level of Challenges (vertical axis, marked Known and Unknown) against Impact/Usefulness (horizontal axis)]
• Make an easy problem harder
• Make a hard problem easier
• Increase impact (more general)
“Shortcut” for Starting Research
• Scan the most recently published papers to find papers that you like or can
understand
• Read such papers in detail
• Track down background papers to increase your understanding
• Brainstorm ideas for extending the work
– Start with ideas mentioned in the future work part
– Systematically question the soundness of the paper (have the authors
answered all the questions? Can you think of questions that aren’t
answered?)
– Is there a better formulation of the problem?
– Is there a better method for solving the problem?
– Is the evaluation solid?
• Pick one new idea and work on it
How to formulate and test
research hypotheses?
Formulate Research Hypotheses
• Typical hypotheses:
– Hypothesis about user characteristics (tested with user studies or
user-log analysis, e.g., click-through bias)
– Hypothesis about data characteristics (tested with fitting actual data,
e.g., Zipf’s law)
– Hypothesis about methods (tested with experiments):
• Method A works (or doesn’t work) for task B under condition C by
measure D (feasibility)
• Method A performs better than method A’ for task B under condition
C by measure D (comparative)
• Introducing baselines naturally leads to hypotheses
• Carefully study existing literature to figure out where exactly you can make
a new contribution (what do you want others to cite your work as?)
• The more specialized a hypothesis is, the more likely it’s new, but a narrow
hypothesis has lower impact than a general one, so try to generalize as
much as you can to increase impact
• But avoid over-generalizing (claims must be supported by your experiments)
• Tuning hypotheses
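To make "tested with fitting actual data" concrete for a data-characteristic hypothesis such as Zipf’s law, here is a minimal sketch. It assumes the hypothesis is stated as frequency proportional to rank^(-s) and estimates s with an ordinary least-squares line through the log-log points; the function name zipf_exponent is made up for this example, and this simple fit is only a rough check, not a rigorous estimator.

```python
import math
from collections import Counter


def zipf_exponent(tokens):
    """Estimate s in frequency ~ rank**(-s) by least squares on log-log values."""
    freqs = sorted(Counter(tokens).values(), reverse=True)
    if len(freqs) < 2:
        raise ValueError("need at least two distinct tokens")
    # x = log(rank), y = log(frequency); Zipf's law predicts a straight line.
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return -slope  # the fitted line has slope -s
```

An exponent near 1 on a real corpus would be consistent with the classic form of the law; a clearly different value, or a visibly curved log-log plot, is a negative result worth explaining rather than discarding.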
Procedure of Hypothesis Testing
• Clearly define the hypothesis to be tested (include
any necessary conditions)
• Design the right experiments to test it (experiments
must match the hypothesis in all aspects)
• Carefully analyze results (seek understanding and
explanation rather than just description)
• Unless you’ve got a complete understanding of
everything, always attempt to formulate a further
hypothesis to achieve better understanding
Clearly Define a Hypothesis
• A clearly defined hypothesis helps you choose the
right data and right measures
• Make sure to include any necessary conditions so
that you don’t overclaim
• Be clear about any justification for your hypothesis
(testing a random hypothesis requires more data
than testing a well-justified hypothesis)
Design the Right Experiments
• Flawed experiment design is a common cause of rejection of
an IR paper (e.g., a poorly chosen baseline)
• The data should match the hypothesis
– A general claim like “method A is better than B” would need a
variety of representative data sets to support
• The measure should match the hypothesis
– Multiple measures are often needed (e.g., both precision and
recall)
• The experiment procedure shouldn’t be biased
– Comparing A with B requires using an identical procedure for both
– Common mistake: baseline method not tuned or not tuned
seriously
• Test multiple hypotheses simultaneously if possible (for the
sake of efficiency)
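One way to picture "identical procedure for both" together with "multiple measures" is the sketch below. It assumes each method is a function method(param, query) that returns retrieved ids and that examples are (query, relevant_ids) pairs; these conventions and the names precision_recall, mean_scores, and tune_then_test are assumptions made for this example. The point is that the baseline and the new method both pass through the same tune_then_test call, so neither is left untuned.

```python
from statistics import mean


def precision_recall(retrieved, relevant):
    """Set-based precision and recall for one query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def mean_scores(method, param, examples):
    """Mean precision and recall of method(param, query) over the examples."""
    scores = [precision_recall(method(param, q), rel) for q, rel in examples]
    return mean(p for p, _ in scores), mean(r for _, r in scores)


def tune_then_test(method, param_grid, dev_examples, test_examples):
    """Tune on dev data (F1 of the mean scores), then report test scores."""
    def dev_f1(param):
        p, r = mean_scores(method, param, dev_examples)
        return 2 * p * r / (p + r) if p + r else 0.0

    best = max(param_grid, key=dev_f1)
    return best, mean_scores(method, best, test_examples)
```

Calling tune_then_test once for method A and once for method B, with the same grids of comparable size and the same dev/test split, keeps the comparison procedure unbiased in the sense described above.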
Carefully Analyze the Results
• Run a significance test if possible/meaningful
• Go beyond just getting a yes/no answer
– If positive: seek evidence to support your original
justification of the hypothesis.
– If negative: look into reasons to understand how your
hypothesis should be modified
– In general, seek explanations of everything!
• Get as much as possible out of the results of one
experiment before jumping to run another
– Don’t throw away negative data
– Try to think of alternative ways of looking at data
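The slides do not say which significance test to use. As one possible choice, the sketch below runs a paired sign-flip randomization test on per-query scores of two methods; the function name, the default number of trials, and the fixed seed are assumptions made for this example.

```python
import random


def paired_randomization_test(scores_a, scores_b, trials=10000, seed=0):
    """Approximate two-sided p-value for the paired difference between A and B."""
    if len(scores_a) != len(scores_b):
        raise ValueError("scores must be paired per query")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    observed = abs(sum(diffs))
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    extreme = 0
    for _ in range(trials):
        # Under the null hypothesis the sign of each difference is arbitrary.
        flipped = sum(d if rng.random() < 0.5 else -d for d in diffs)
        if abs(flipped) >= observed:
            extreme += 1
    # Add-one smoothing avoids reporting an impossible p-value of exactly 0.
    return (extreme + 1) / (trials + 1)
```

A small p-value only says the difference is unlikely to be chance; it does not explain why A beats B, which is why the slide asks you to go beyond the yes/no answer.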
Modify a Hypothesis
• Don’t stop at the current hypothesis; try to generate a
modified hypothesis to further discover new
knowledge
• If your hypothesis is supported, think about the
possibility of further generalizing the hypothesis, and
then test the new hypothesis
• If your hypothesis isn’t supported, think about how to
narrow it down to some special cases to see if it can
be supported in a weaker form
Derive New Hypotheses
• After you finish testing some hypotheses and
reaching conclusions, try to see if you can derive
interesting new hypotheses
– Your data may suggest an additional (sometimes
unrelated) hypothesis; you get a by-product
– A new hypothesis can also logically follow a current
hypothesis or help further support a current
hypothesis
• New hypotheses may help find causes:
– If the cause is X, then H1 must be true, so we test H1
Summary
• Research is about discovery and increasing our knowledge
(innovation & understanding)
• Intellectual curiosity and critical thinking are extremely
important
• Creativity is a big plus!
• Work on important problems that you are passionate about
• Aim at becoming a top expert on one topic area
– Obtain complete knowledge about the literature on the topic (read
all the important papers and monitor the progress)
– Write a survey if appropriate
– Publish one or more high-quality papers on the topic