Scope
Today have to do something with pivot in dynamic SQL, I tweaked the scenario a bit so I am not giving out any client information.
Objectives
To create a sproc that retrieve a list of entries (candidates) with their basic info, answers (if any) for their corresponding questions in a result set.
Database Table Structure
- Candidates
- CandidateCategories
- CandidateInCategories
- Core Questions
- Core Answers
- CandidateID - FK
- CoreQuestionID - FK
- Optional Questions
- CandidateCategoryID - FK
- Optional Answers
- CandidateID – FK
- OptionalQuestionID - FK
Technical Requirements
- Performance is not an issue, it will run less than 5 times a year
- Need to pull out basic info of the candidate and their answer to questions, 1 candidate per row.
- All Core and all Optional Questions will become the column header of the result set.
- Depends on which categories that candidates in, they will have zero to all optional answers.
- An extra column to be generated to indicate the Candidate Category, just before the columns of the set of questions.
- Categories are to be sorted in order base on a column name CandidateCategory.CategoryOrder
- Questions are to be sorted in order base on a column name QuestionNumber in CoreQuestion and OptionalQuestion
- This column is ‘yes’ / ‘no’ depends if the candidate had answered any optional questions in that category.
- Questions are unlimited and dynamically entered by clients.
Technical Overview
- The result set is some kind of join between Candidate (for the basic info) and questions / answers.
- Obviously pulling out questions by using select * from then join will not get the questions span to the column names, this is some kind of transpose with pivot, but it is only partial. Basic info do not require transpose.
Solution Overview
CTE for information
Let’s walk through the code steps by steps. I know I need to get information from multiple tables, this information will be used exactly twice. Once to generate the questions in column header and once to do the partial transpose.-- Column definition, questions in this case DECLARE @cols AS NVARCHAR(MAX) DECLARE @query AS NVARCHAR(MAX) Declare @WithQuery as NVARCHAR(MAX) ; -- Question (column names) not to exceed 128, this is SQL Standard Set @WithQuery = ' with cte as ( SELECT e.ItemID as CandidateID, cq.CandidateCategoryID, left(cast(ac.CandidateCategoryID as varchar(10)) + '' - '' + ac.Name, 128) as Question, Case when eca.AnswerID is null then ''No'' else ''Yes'' End as Answer, ac.CategoryOrder as CategoryOrder, -1 as QuestionNumber FROM Candidate e CROSS JOIN OptionalQuestion cq LEFT JOIN OptionalAnswer eca ON eca.QuestionID = cq.QuestionID AND eca.CandidateID = e.CandidateID JOIN CandidateCategory ac ON (ac.CandidateCategoryID = cq.CandidateCategoryID) UNION -- Core Questions SELECT e.ItemID as CandidateID, -1, left(''Core Q'' + cast(cq.QuestionNumber as varchar(10)) + '' - '' + cq.Question, 128) as Question, eca.Answer, -1 as CategoryOrder, cq.QuestionNumber FROM Candidate e CROSS JOIN CoreQuestion cq LEFT JOIN CandidateCoreAnswers eca ON eca.QuestionID = cq.ItemID AND eca.CandidateID = e.ItemID WHERE (@Filter1 IS NULL OR e.Filter1 = @Filter1) AND (@Filter2 IS NULL OR e.Filter2 = @Filter2) AND (@Filter3 IS NULL OR e.Filter3 = @Filter3) UNION -- Optional Questions SELECT e.ItemID as CandidateID, cq.CandidateCategoryID, left(cast(ac.CandidateCategoryID as varchar(10)) + '' - Optional Q'' + cast(cq.QuestionNumber as varchar(10)) + '' - '' + cq.Question, 128) as Question, eca.Answer, ac.CategoryOrder as CategoryOrder, cq.QuestionNumber FROM Candidate e CROSS JOIN OptionalQuestion cq JOIN CandidateCategory ac ON (ac.ItemID = cq.CandidateCategoryID) LEFT JOIN OptionalAnswer eca ON eca.QuestionID = cq.QuestionID AND eca.CandidateID = e.CandidateID WHERE (@Filter1 IS NULL OR e.Filter1 = @Filter1) AND (@Filter2 IS NULL OR e.Filter2 = @Filter2) AND (@Filter3 IS NULL OR e.Filter3 = @Filter3) ) '
- All the required information is now stored in CTE
- Interesting point to note is column names must be less than or equal to 128 characters even though it is generated dynamically and not stored in sysname.
- Line 22, –1 as question number will display the category before the question
- Line 36, similarly, –1 as category order will display core questions before optional.
STUFF and QUOTENAME by XML PATH for column header
Set @query = @WithQuery + ' Select @cols = STUFF( (SELECT '','' + QUOTENAME(c.question) from ( select distinct CategoryOrder, QuestionNumber, question from cte ) c Order by CategoryOrder, QuestionNumber FOR XML PATH(''''), TYPE ).value(''.'', ''NVARCHAR(MAX)'') ,1,1,'''') ' exec sp_executesql @query, N'@Filter1 int, @Filter2 int, @Filter3 int, @cols nvarchar(max) output', @Filter1 = @Filter1, @Filter2 = @Filter2, @Filter3 = @Filter3, @cols = @cols output
- Line 1, reusing the CTE (should have called it @CTE but anyway)
- The result is a massive long string that has all the questions
Dynamic SQL
set @query = @WithQuery + ' SELECT e.info1, e.info2, e.info3, e.infoN, r.info, derived.* FROM Candidate e LEFT JOIN OtherInfo r ON (e.OtherInfoID = r.OtherInfoID) JOIN ( SELECT CandidateID, ' + @cols + ' from ( select CandidateID , isnull(answer, '''') as answer , question from cte ) x pivot ( max(answer) for question in (' + @cols + ') ) p ) derived on derived.CandidateID = e.CandidateID WHERE (@Filter1 IS NULL OR e.Filter1 = @Filter1) AND (@Filter2 IS NULL OR e.Filter2 = @Filter2) AND (@Filter3 IS NULL OR e.Filter3 = @Filter3) ; ' exec sp_executesql @query, N'@Filter1 int, @Filter2 int, @Filter3 int', @Filter1 = @Filter1, @Filter2 = @Filter2, @Filter3 = @Filter3
- Line 3, derived is the questions and answers that will be transposed
- Line 5, Candidate info can be drawn from Canddate table or join other tables for more information, just normal join stuff.
- Line 6 – 18 is the tough part for doing the pivot (transpose), notice how line 7 and 17 use @cols dynamically appended to the @Query
- Line 16 uses max for aggregation, noticed it became a limit that 1 candidate can only have 1 answer record per question.