{{av}}
 
 
 
The quality of your automatic content analysis depends on the quality of your search strings, which in turn depends on their reliability. The reliability of a study refers to whether the study is replicable. If it is, other researchers who use your research method will find the same results. Besides being reliable, your research method should also be valid, meaning that you actually measure what you intend to measure. Unreliable search terms by definition lead to invalid results. In automatic content analysis reliability is high, because different computers given identical instructions generate exactly the same results. However, validity is low, since a computer recognizes words but not concepts. In contrast, human coders given the same instructions will not always get the same results, because of personal interpretations and cultural backgrounds, so the reliability of manual content analysis is often lower than that of automatic content analysis. Human coders are, however, capable of recognizing concepts, which improves the validity of the results.
 
  
  
 
You can check the face validity of your search terms by looking at the AmCAT search results: read the articles that are identified as containing your search terms and judge whether they cover the concept you intend to measure. AmCAT offers several ways to access these articles:
 
* Using the [[3.3:Summary|Summary function]], you can list all the articles that contain your search terms. You can access each of these documents by clicking on its title in the list. Your search terms are highlighted in red.
* Using the 'Graph' option of the [[3.3:Graph/Table|Graph/Table function]], you can click on any dot on the line to get a list of relevant articles. By clicking on the titles in the list you can access each article.
* Using the 'ClusterMap' option of the [[3.3:Summary|Summary function]], you can make a Venn diagram. By clicking on a dot in the Venn diagram, you get access to that particular article. If you have a large number of articles, the Venn diagram displays a single large dot. By restricting your search to a certain period or medium, you can narrow down the number of articles, and individual dots will appear.
  
 
== Reliability ==
 
  
  
So, how can you determine the number of true and false positives? You can use the Query function in AmCAT. The simplest way to check whether your search results actually measure the intended concept is to display them in the context in which they occur. When you select the [[3.3:Summary|Summary function]], you get a list of search results together with the context in which the search terms occur. You can calculate the precision by drawing a sample of X articles. For each article in the sample, check whether your search string measured what you intended to measure. If so, label the article a true positive; if not, label it a false positive. Precision is the number of true positives divided by the total number of articles in the sample. Suppose, for example, that 13 of the 50 articles in your sample are false positives. Then 50 - 13 = 37 articles are true positives, and 37/50 = 0.74, so your precision is 74%.
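
The precision calculation can also be scripted once you have coded your sample by hand. The following is a minimal sketch in Python, not part of AmCAT: the function names <code>draw_sample</code> and <code>precision</code> are hypothetical, and the labels are assumed to be entered manually after reading each sampled article.

```python
import random


def draw_sample(article_ids, sample_size, seed=0):
    """Draw a reproducible random sample of article ids to code by hand.

    Hypothetical helper: AmCAT itself is not called here, the ids are
    assumed to have been exported from your search results.
    """
    rng = random.Random(seed)
    return rng.sample(list(article_ids), sample_size)


def precision(labels):
    """Return the share of sampled articles labelled as true positives.

    `labels` holds one boolean per sampled article: True for a true
    positive, False for a false positive.
    """
    if not labels:
        raise ValueError("cannot compute precision for an empty sample")
    return sum(labels) / len(labels)


# Worked example from the text: 50 sampled articles, 13 false positives.
labels = [True] * 37 + [False] * 13
print(f"precision = {precision(labels):.2f}")  # 37 / 50
```

Fixing the seed in <code>draw_sample</code> means that another researcher can draw exactly the same sample, which keeps this manual validation step replicable.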
 
   
 
   
 
== Recall ==
 
