I notice I am confused. I was sure that the FFM came out of doing the following simple procedure:
1. Give people a many-item personality survey
2. Do a PCA of the resulting data
3. Keep the top 5 eigenvectors
4. Label them with reasonably accurate adjectives that seem to describe the general drift of the vector
How wrong is this? How important is the “lexical hypothesis” part?
That’s right. The lexical hypothesis only comes in at step 1 by including questions like “I am [adjective].” We start with a vague theory in the questionnaire and apply dimension reduction. The lexical hypothesis is that language gives us a vague theory. We want as broad a theory as possible, so it is useful to combine questionnaires. Some sources claim that the original questionnaire was generated from language without questions from explicit theories, but I don’t think that’s correct.
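For concreteness, here is a minimal sketch of the survey-then-PCA procedure from the question, using NumPy. The data are synthetic and purely an assumption for illustration: 1000 made-up respondents, 50 made-up items, and 5 hidden traits planted by construction, so the result says nothing about real personality data. The function name `top_components` is hypothetical.

```python
import numpy as np

def top_components(responses, k=5):
    """Return the top-k eigenvalues and eigenvectors of the item correlation matrix."""
    # Standardize each item (column) so the PCA runs on correlations.
    z = (responses - responses.mean(axis=0)) / responses.std(axis=0)
    corr = (z.T @ z) / z.shape[0]
    # eigh handles symmetric matrices and returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1][:k]
    return eigvals[order], eigvecs[:, order]

# Synthetic stand-in for survey responses (an assumption, not real data):
# each item loads on exactly one of 5 hidden traits, plus noise.
rng = np.random.default_rng(0)
n_people, n_items, n_traits = 1000, 50, 5
traits = rng.normal(size=(n_people, n_traits))
loadings = np.zeros((n_traits, n_items))
for item in range(n_items):
    loadings[item % n_traits, item] = 1.0
responses = traits @ loadings + 0.5 * rng.normal(size=(n_people, n_items))

# Keep the top 5 eigenvectors, as in the procedure.
eigvals, components = top_components(responses, k=5)

# Share of total variance captured by the top 5 components
# (the trace of a correlation matrix equals the number of items).
explained = eigvals.sum() / n_items
```

The labeling step is the part code cannot do: one would look at the items with the largest absolute weights in each column of `components` and pick an adjective that summarizes them. Note also that the components can only reflect whatever items went into the survey, which is where the choice of questionnaire matters.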