Commit

Merge pull request #27 from mlibrary/callumber_fix_Jan_2024
Fix callnumber logic in java plugin
niquerio authored Feb 9, 2024
2 parents eb20a39 + d568e27 commit 6bb2c2c
Showing 3 changed files with 38 additions and 16 deletions.
17 changes: 15 additions & 2 deletions solr/Dockerfile
@@ -1,12 +1,25 @@
 FROM solr:8.11.2

 # Configure for basic auth
 ENV SOLR_AUTH_TYPE="basic"
 ENV SOLR_AUTHENTICATION_OPTS="-Dbasicauth=solr:SolrRocks"
 ENV SOLR_OPTS="-Denable.packages=true"


-COPY --chown=solr:solr lib/library_identifier_solr_filters-0.9.6-solr8.8.2.jar /var/solr/lib/library_identifier_solr_filters-0.9.6-solr8.8.2.jar
-COPY --chown=solr:solr lib/solr_analyzed_string-1.0.jar /var/solr/lib/solr_analyzed_string-1.0.jar
+# One of the places solr will always look for .jar files, and where zookeeper
+# will always make sure it can find them, is in `solr.solr.home`/lib.
+# We'll go ahead and make that directory, and put all our jars in it.
+
+ENV SOLR_HOME=/var/solr/data
+ENV SOLR_LIB=/var/solr/data/lib
+RUN mkdir -p $SOLR_LIB
+
+# Now copy everything from our local solr/lib to that location, so all the
+# replicas can find them
+
+COPY --chown=solr:solr lib/*.jar $SOLR_LIB
+
+# Set up a security.json so we can actually log in
+
 COPY --chown=solr:solr dev_init/security.json /var/solr/data/security.json
 COPY --chown=solr:solr dev_init/solr_init.sh /usr/bin/solr_init.sh
37 changes: 23 additions & 14 deletions solr/call_number_browse/conf/schema/callnumbers.xml
@@ -3,40 +3,47 @@
 Callnumbers (LC and Dewey)
 Our callnumber uses custom solr java code
-(git@github.com:billdueber/library_identifier_solr_filters) that normalizes
-LC and Dewey callnumbers for more accurate searching.
+(git@github.com:mlibrary/edu-umich-lib-solrMegapack) that (very roughly) normalizes
+LC and Dewey callnumbers for more useful searching and sorting.
 We expose two custom field types and one analysis filter:
-* `edu.umich.library.library_identifier.schema.CallnumberSortableFieldType`
+* `edu.umich.lib.solr.libraryIdentifier.callnumber.fieldType.CallnumberSortableFieldType`
 will store the value as given, and index a value that is suitable
 for exact matches (callnumber:"QA 11.2 .C3"), range queries
 (callnumber:[* TO "QA 11.2 .C3"]), and sorting (if single-valued). It
 takes an argument "passThroughInvalid" (default: false) to determine
 whether or not to ignore (default) or pass through normalized values that don't
 look like call numbers.
-* `edu.umich.library.library_identifier.schema.CallNumberSortKeyFieldType`
+* `edu.umich.lib.solr.libraryIdentifier.callnumber.fieldType.CallNumberSortKeyFieldType`
 is much like the above but the stored value is the same as the indexed value.
-* `edu.umich.library.lucene.analysis.AnyCallNumberSimpleFilterFactory`
+* `edu.umich.lib.solr.libraryIdentifier.callnumber.analysis.AnyCallNumberSimpleFilterFactory`
 does the same transformation on a single token (so, use KeywordTokenizer)
-as above, but in the analysis chain so it can be combined with
+as above, but in the analysis chain, so it can be combined with
 `solr.EdgeNGramFilterFactory` to get prefix ("starts-with") searches for LC/Dewey.
 ============================================================== -->


 <!-- For sorting and range queries. Stored is what you send it, indexed is the sortable key -->
-<fieldType name="any_callnumber" class="edu.umich.library.library_identifier.schema.CallnumberSortableFieldType"
-docValues="false" multiValued="true" stored="true" passThroughOnError="true"/>
+<fieldType name="any_callnumber"
+class="edu.umich.lib.solr.libraryIdentifier.callnumber.fieldType.CallnumberSortableFieldType"
+passThroughOnError="true"
+docValues="false" multiValued="true" stored="true"
+/>

-<fieldType name="any_callnumber_strict" class="edu.umich.library.library_identifier.schema.CallnumberSortableFieldType"
+<fieldType name="any_callnumber_strict"
+class="edu.umich.lib.solr.libraryIdentifier.callnumber.fieldType.CallnumberSortableFieldType"
 docValues="false" multiValued="true" stored="true"/>

 <!-- As above, but the index sort key is what's exposed as the stored version, too -->
-<fieldType name="any_callnumber_sort_key" class="edu.umich.library.library_identifier.schema.CallNumberSortKeyFieldType"
-docValues="false" multiValued="true" stored="true" passThroughOnError="true"/>
+<fieldType name="any_callnumber_sort_key"
+class="edu.umich.lib.solr.libraryIdentifier.callnumber.fieldType.CallNumberSortKeyFieldType"
+passThroughOnError="true"
+docValues="false" multiValued="true" stored="true"/>

-<fieldType name="any_callnumber_sort_key_strict" class="edu.umich.library.library_identifier.schema.CallNumberSortKeyFieldType"
+<fieldType name="any_callnumber_sort_key_strict"
+class="edu.umich.lib.solr.libraryIdentifier.callnumber.fieldType.CallNumberSortKeyFieldType"
 docValues="false" multiValued="true" stored="true"/>


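Note on the hunk above: the schema comment separates the stored value (what you send) from the indexed value (a sortable key). The reason a separate key is needed can be illustrated with a toy example. This is a minimal Python sketch, not the plugin's normalization. The function `toy_sort_key` and its key layout (padded class letters, zero-padded integer part, then the remainder) are invented for illustration and handle only simple LC-style input. The behavior given to `pass_through_on_error` (return the input unchanged instead of dropping it) is an assumed reading of the comment.

```python
import re
from typing import Optional

# Hypothetical pattern: 1-3 class letters, an integer part, then anything else.
_CALLNUMBER = re.compile(r"\s*([A-Za-z]{1,3})\s*(\d+)(.*)")


def toy_sort_key(callnumber: str, pass_through_on_error: bool = False) -> Optional[str]:
    """Return a string that sorts in shelf order, or handle a non-call-number.

    Illustrative only; the real plugin's key format is not shown in this diff.
    """
    match = _CALLNUMBER.fullmatch(callnumber)
    if match is None:
        # Assumed semantics: strict types drop the value (None here),
        # pass-through types keep the input as given.
        return callnumber if pass_through_on_error else None
    letters, number, rest = match.groups()
    # Pad letters to width 3 and the integer part to width 4 so that plain
    # string comparison agrees with numeric order (9 sorts before 11).
    return f"{letters.upper():<3}{int(number):04d}{rest.strip().upper()}"


raw = ["QA 9", "QA 11.2 .C3"]
by_raw_string = sorted(raw)            # compares character by character
by_key = sorted(raw, key=toy_sort_key)  # compares the normalized keys
```

Sorting the raw strings puts "QA 11.2 .C3" ahead of "QA 9" because "1" compares lower than "9", while sorting by the key restores shelf order. The same comparison is what makes a range query such as `callnumber:[* TO "QA 11.2 .C3"]` include "QA 9".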
@@ -45,12 +52,14 @@
 <analyzer type="index">
 <tokenizer class="solr.KeywordTokenizerFactory"/>
 <filter class="solr.ICUFoldingFilterFactory"/>
-<filter class="edu.umich.library.lucene.analysis.AnyCallNumberSimpleFilterFactory" passThroughOnError="false"/>
+<filter class="edu.umich.lib.solr.libraryIdentifier.callnumber.analysis.AnyCallNumberSimpleFilterFactory"
+passThroughOnError="false"/>
 <filter class="solr.EdgeNGramFilterFactory" maxGramSize="40" minGramSize="1"/>
 </analyzer>
 <analyzer type="query">
 <tokenizer class="solr.KeywordTokenizerFactory"/>
 <filter class="solr.ICUFoldingFilterFactory"/>
-<filter class="edu.umich.library.lucene.analysis.AnyCallNumberSimpleFilterFactory" passThroughOnError="true"/>
+<filter class="edu.umich.lib.solr.libraryIdentifier.callnumber.analysis.AnyCallNumberSimpleFilterFactory"
+passThroughOnError="true"/>
 </analyzer>
 </fieldType>
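
Note on the hunk above: the index analyzer ends with `solr.EdgeNGramFilterFactory` while the query analyzer does not, which is what produces "starts-with" matching. The following Python sketch models that asymmetry under stated assumptions. The function `normalize` is a hypothetical stand-in (lowercasing and collapsing whitespace) for the ICU folding and call number filters, whose real output is not shown in this diff, and the sketch ignores the differing `passThroughOnError` settings.

```python
def normalize(callnumber: str) -> str:
    """Hypothetical stand-in for ICU folding plus the call number filter."""
    return " ".join(callnumber.lower().split())


def edge_ngrams(token: str, min_gram: int = 1, max_gram: int = 40) -> set:
    """Leading-edge n-grams of one token, from min_gram up to max_gram chars."""
    longest = min(len(token), max_gram)
    return {token[:n] for n in range(min_gram, longest + 1)}


def index_terms(callnumber: str) -> set:
    # Index analyzer: keep the whole value as one token, normalize it,
    # then expand it into every leading prefix.
    return edge_ngrams(normalize(callnumber))


def query_term(user_input: str) -> str:
    # Query analyzer: normalize only, so the query stays a single term.
    return normalize(user_input)


def prefix_match(user_input: str, indexed_callnumber: str) -> bool:
    # A document matches when the single query term equals one indexed prefix.
    return query_term(user_input) in index_terms(indexed_callnumber)
```

Under this model, `prefix_match("QA 11", "QA 11.2 .C3")` is true, while `prefix_match("11.2", "QA 11.2 .C3")` is false because only leading prefixes are indexed. Values longer than `maxGramSize` would match only on their first 40 normalized characters.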
Binary file not shown.
