The Signal class serves its purpose of retrieving raw data for a single signal. However, it is difficult to generalize it to acquire the same signal across several circuits. In addition, typical processing requires additional classes and introduces even more local variables:
circuit_type = 'RB'
circuit_name = 'RB.A12'
t_start = '2015-01-13 16:59:11+01:00'
t_end = '2015-01-13 17:15:46+01:00'
db = 'NXCALS'
system = 'PC'
metadata_pc = SignalMetadata.get_circuit_signal_database_metadata(circuit_type, circuit_name, system, db)
I_MEAS = SignalMetadata.get_signal_name(circuit_type, circuit_name, system, db, 'I_MEAS')
i_meas_nxcals_df = Signal().read(db, signal=I_MEAS, t_start=t_start, t_end=t_end,
nxcals_device=metadata_pc['device'], nxcals_property=metadata_pc['property'], nxcals_system=metadata_pc['system'],
spark=spark)
i_meas_nxcals_df = SignalUtilities.synchronize_df(i_meas_nxcals_df)
i_meas_nxcals_df = SignalUtilities.convert_indices_to_sec(i_meas_nxcals_df)
Several design flaws lead to inconsistency and code duplication:
- use of multiple methods, multiple arguments (duplicated across methods)
- multiple local variables (naming consistency across analysis modules)
- order of methods and arguments (with duck typing) not fixed
The API also does not answer the following questions in a generic way:
- What if we want to get current for each circuit?
- What if we want to get several current signals?
Natural languages have a certain structure [1]:
Language | Word order | Example |
---|---|---|
English | {Subject}.{Verb}.{Object} | John ate cake |
Japanese | {Subject}.{Object}.{Verb} | John-ga keiki-o tabeta (literally: John cake ate) |
One can enforce syntactical order in code:
- Domain Specific Language – new language, requires parser
- Embedded Domain Specific Language – extends existing language
Furthermore, an eDSL can be implemented following the fluent interface approach [2]. The use of an eDSL for signal query and processing is not a new concept: an eDSL in Java is already used to automatically check signals during the Hardware Commissioning campaigns of the LHC [3].
[1] K. Gulordava, Word order variation and dependency length minimisation: a cross-linguistic computational approach, PhD thesis, UniGe,
[2] https://en.wikipedia.org/wiki/Fluent_interface
[3] M. Audrain, et al. - Using a Java Embedded Domain-Specific Language for LHC Test Analysis, ICALEPCS2013, San Francisco, CA, USA
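As a minimal sketch of the fluent interface idea, the hypothetical `Sentence` class below (an illustration only, not part of lhcsmapi) returns `self` from every method, so that calls can be chained into a Subject-Verb-Object sentence.

```python
class Sentence:
    """Hypothetical fluent builder: every method returns self to allow chaining."""

    def __init__(self):
        self.words = []

    def subject(self, word):
        self.words.append(word)
        return self

    def verb(self, word):
        self.words.append(word)
        return self

    def obj(self, word):
        self.words.append(word)
        return self

    def build(self):
        # Join the collected words into the final sentence
        return ' '.join(self.words)


sentence = Sentence().subject('John').verb('ate').obj('cake').build()
print(sentence)
```

Note that this sketch only demonstrates chaining; on its own it does not enforce the order in which the methods are called.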
1. QueryBuilder()
We propose a Python embedded Domain Specific Language (pyeDSL) for building queries:
- General purpose query
{DB}.{DURATION}.{QUERY_PARAMETERS}.{QUERY}
e.g.
df = QueryBuilder().{with_pm()/with_cals()/with_nxcals()}.with_duration().with_query_parameters()\
.signal_query().dfs[0]
df = QueryBuilder().{with_pm()/with_cals()/with_nxcals()}.with_timestamp().with_query_parameters()\
.signal_query().dfs[0]
- Circuit-oriented query to generalize query across and within circuit types
{DB}.{DURATION}.{CIRCUIT_TYPE}.{METADATA}.{QUERY}
e.g.
df = QueryBuilder().{with_pm()/with_cals()/with_nxcals()}.with_duration().with_circuit_type().with_metadata()\
.signal_query().dfs[0]
df = QueryBuilder().{with_pm()/with_cals()/with_nxcals()}.with_timestamp().with_circuit_type().with_metadata()\
.signal_query().dfs[0]
- each parameter defined once (validation of input at each stage)
- single local variable
- order of operation is fixed
- support for vector inputs
- time-dependent metadata
The pyeDSL provides hints on the order of execution
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder
QueryBuilder()
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder
QueryBuilder().with_nxcals(spark)
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder
QueryBuilder().with_nxcals(spark).with_duration(t_start='2015-03-13 05:20:59.4910002', duration=[(100, 's'), (100, 's')])
At the same time, it prohibits unsupported operations by raising a meaningful exception:
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder
QueryBuilder().with_duration(t_start='2015-03-13 05:20:59.4910002', duration=[(100, 's'), (100, 's')])
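One way such order enforcement could be achieved is to let every stage of the sentence be its own class that exposes only the methods valid at that point. The sketch below is an assumption about a possible design, not the actual lhcsmapi implementation; `SketchQueryBuilder`, `_Stage`, `_DbStage`, `_DurationStage` and `build()` are hypothetical names.

```python
class _Stage:
    """Hypothetical base class: unknown methods raise a descriptive error."""

    def __getattr__(self, name):
        # __getattr__ is only invoked when the normal attribute lookup fails
        allowed = [m for m in dir(type(self)) if not m.startswith('_')]
        raise AttributeError(
            f"{name}() is not supported at this stage; allowed: {allowed}")


class SketchQueryBuilder(_Stage):
    def with_nxcals(self, spark):
        return _DbStage({'db': 'NXCALS', 'spark': spark})


class _DbStage(_Stage):
    def __init__(self, params):
        self._params = params

    def with_duration(self, t_start, t_end):
        return _DurationStage({**self._params, 't_start': t_start, 't_end': t_end})


class _DurationStage(_Stage):
    def __init__(self, params):
        self._params = params

    def build(self):
        return dict(self._params)


# Valid order: database first, then duration
query = SketchQueryBuilder().with_nxcals(spark=None).with_duration(t_start=0, t_end=10).build()

# Invalid order: duration before database
try:
    SketchQueryBuilder().with_duration(t_start=0, t_end=10)
except AttributeError as error:
    message = str(error)
    print(message)
```

Because each stage is a separate class, an interactive environment can also offer only the valid next methods for auto-completion.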
Sections 1.1 to 1.6 cover PM event and signal queries, CALS signal queries, and NXCALS signal and feature queries (as shown in Table 1). Each case is presented first as a general-purpose query, where the user has to provide the full signal name and its metadata, and then as a circuit-oriented query, which provides a generic way of querying LHC circuit variables. Both query types are polymorphic and are complemented with a set of post-processing functions.
*Since CALS is about to be discontinued at CERN, we do not support this type of query. For feature query with pytimber, please consult https://gitlab.cern.ch/scripting-tools/pytimber
1.1. General-Purpose Query - Examples
A sentence constructed this way preserves the differences between query types while providing a common structure:
- PM event query
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder
source_timestamp_df = QueryBuilder().with_pm() \
.with_duration(t_start='2015-03-13 05:20:59.4910002', duration=[(100, 's'), (100, 's')]) \
.with_query_parameters(system='FGC', className='51_self_pmd', source='RPTE.UA47.RB.A45') \
.event_query().df
source_timestamp_df
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder
source_timestamp_df = QueryBuilder().with_pm() \
.with_duration(t_start='2015-03-13 05:20:59.4910002', duration=[(24*60*60, 's')]) \
.with_query_parameters(system='QPS', className='DQAMCNMB_PMHSU', source='*') \
.event_query().df
source_timestamp_df
- PM signal query
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder
i_meas_df = QueryBuilder().with_pm() \
.with_timestamp(1426220469520000000) \
.with_query_parameters(system='FGC', source='RPTE.UA47.RB.A45', className='51_self_pmd', signal='STATUS.I_MEAS') \
.signal_query().dfs[0]
i_meas_df.plot()
- CALS signal query
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder
import pytimber
ldb = pytimber.LoggingDB()
i_meas_df = QueryBuilder().with_cals(ldb) \
.with_duration(t_start='2015-03-13 05:20:59.4910002', duration=[(100, 's'), (100, 's')]) \
.with_query_parameters(signal='RPTE.UA47.RB.A45:I_MEAS') \
.signal_query().dfs[0]
i_meas_df.plot()
- NXCALS signal query - device, property
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder
i_meas_df = QueryBuilder().with_nxcals(spark) \
.with_duration(t_start='2015-03-13 05:20:59.4910002', duration=[(100, 's'), (100, 's')]) \
.with_query_parameters(nxcals_system='CMW', nxcals_device='RPTE.UA47.RB.A45', nxcals_property='SUB', signal='I_MEAS') \
.signal_query().dfs[0]
i_meas_df.plot()
- NXCALS signal query - variable
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder
u_mag_df = QueryBuilder().with_nxcals(spark) \
.with_duration(t_start='2015-03-13 05:20:59.4910002', duration=[(100, 's'), (100, 's')]) \
.with_query_parameters(nxcals_system='WINCCOA', signal='DCBA.15R4.R:U_MAG') \
.signal_query().dfs[0]
u_mag_df.plot()
bmode_df = QueryBuilder().with_nxcals(spark) \
.with_duration(t_start='2015-03-13 05:20:59.4910002', t_end='2015-04-13 05:20:59.4910002') \
.with_query_parameters(nxcals_system='CMW', signal='HX:BMODE') \
.signal_query().dfs[0]
bmode_df.head()
- NXCALS feature query - device, property
I_MEAS from beam injection, through beam acceleration, to stable beams
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder
i_meas_df = QueryBuilder().with_nxcals(spark) \
.with_duration(t_start=1526898157236000000, t_end=1526903352338000000) \
.with_query_parameters(nxcals_system='CMW', nxcals_device='RPTE.UA23.RB.A12', nxcals_property='SUB', signal='I_MEAS') \
.feature_query(['mean', 'std', 'max', 'min', 'count']).df
i_meas_df
- NXCALS feature query - variable
U_MAG from beam injection, through beam acceleration, to stable beams
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder
u_mag_df = QueryBuilder().with_nxcals(spark) \
.with_duration(t_start=1526898157236000000, t_end=1526903352338000000) \
.with_query_parameters(nxcals_system='WINCCOA', signal='DCBB.8L2.R:U_MAG') \
.feature_query(['mean', 'std', 'max', 'min', 'count']).df
u_mag_df
1.2. General-Purpose Query - Polymorphism
- multiple signals, multiple sources, multiple systems, multiple classNames
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder
i_meas_df = QueryBuilder().with_pm() \
.with_timestamp(1426220469520000000) \
.with_query_parameters(system=['FGC', 'FGC'], source=['RPTE.UA47.RB.A45', 'RPTE.UA47.RB.A45'],
className=['51_self_pmd', '51_self_pmd'], signal=['STATUS.I_MEAS', 'STATUS.I_REF']) \
.signal_query().dfs
ax=i_meas_df[0].plot()
i_meas_df[1].plot(ax=ax)
- multiple signal, single source, single system, single className
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder
i_meas_df = QueryBuilder().with_pm() \
.with_timestamp(1426220469520000000) \
.with_query_parameters(system='FGC', source='RPTE.UA47.RB.A45',
className='51_self_pmd', signal=['STATUS.I_MEAS', 'STATUS.I_REF']) \
.signal_query().dfs
ax=i_meas_df[0].plot()
i_meas_df[1].plot(ax=ax)
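The mix of scalar and list arguments shown above can be understood as a broadcast: scalars are repeated until every parameter has the same length, and one query is issued per position. The helper below is a hypothetical sketch of this idea (`broadcast_parameters` is not an lhcsmapi function, and the library may handle this differently).

```python
def broadcast_parameters(**params):
    """Expand scalar parameters so that every parameter is a list of equal length."""
    lengths = {len(v) for v in params.values() if isinstance(v, list)}
    if len(lengths) > 1:
        raise ValueError(f'List parameters must have equal length, got {sorted(lengths)}')
    n = lengths.pop() if lengths else 1
    return {k: v if isinstance(v, list) else [v] * n for k, v in params.items()}


expanded = broadcast_parameters(system='FGC', source='RPTE.UA47.RB.A45',
                                className='51_self_pmd',
                                signal=['STATUS.I_MEAS', 'STATUS.I_REF'])

# One tuple of (system, source, className, signal) per query
queries = list(zip(*expanded.values()))
for query in queries:
    print(query)
```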
1.3. Circuit-Oriented Query - Examples
- PM event query
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder
source_timestamp_df = QueryBuilder().with_pm() \
.with_duration(t_start='2015-03-13 05:20:59.4910002', duration=[(100, 's'), (100, 's')]) \
.with_circuit_type('RB') \
.with_metadata(circuit_name='RB.A45', system='PC') \
.event_query().df
source_timestamp_df
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder
source_timestamp_df = QueryBuilder().with_pm() \
.with_duration(t_start='2015-03-13 05:20:59.4910002', duration=[(24*60*60, 's')]) \
.with_circuit_type('RB') \
.with_metadata(circuit_name='RB.A45', system='QH', source='*') \
.event_query().df
source_timestamp_df
- PM signal query
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder
i_meas_df = QueryBuilder().with_pm() \
.with_timestamp(1426220469520000000) \
.with_circuit_type('RB') \
.with_metadata(circuit_name='RB.A45', system='PC', signal='I_MEAS') \
.signal_query().dfs[0]
i_meas_df.plot()
- CALS signal query
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder
import pytimber
ldb = pytimber.LoggingDB()
i_meas_df = QueryBuilder().with_cals(ldb) \
.with_duration(t_start='2015-03-13 05:20:59.4910002', duration=[(100, 's'), (100, 's')]) \
.with_circuit_type('RB') \
.with_metadata(circuit_name='RB.A45', system='PC', signal='I_MEAS')\
.signal_query().dfs[0]
i_meas_df.plot()
- NXCALS signal query - device, property
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder
i_meas_df = QueryBuilder().with_nxcals(spark) \
.with_duration(t_start='2015-03-13 05:20:59.4910002', duration=[(100, 's'), (100, 's')]) \
.with_circuit_type('RB') \
.with_metadata(circuit_name='RB.A45', system='PC', signal='I_MEAS') \
.signal_query().dfs[0]
i_meas_df.plot()
- NXCALS signal query - variable
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder
u_mag_df = QueryBuilder().with_nxcals(spark) \
.with_duration(t_start='2015-03-13 05:20:59.4910002', duration=[(100, 's'), (100, 's')]) \
.with_circuit_type('RB') \
.with_metadata(circuit_name='RB.A45', system='BUSBAR', signal='U_MAG', wildcard={'BUSBAR': 'DCBA.15R4.R'}) \
.signal_query().dfs[0]
u_mag_df.plot()
- NXCALS feature query - device, property
I_MEAS from beam injection, through beam acceleration, to stable beams
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder
i_meas_df = QueryBuilder().with_nxcals(spark) \
.with_duration(t_start=1526898157236000000, t_end=1526903352338000000) \
.with_circuit_type('RB') \
.with_metadata(circuit_name='RB.A12', system='PC', signal='I_MEAS') \
.feature_query(['mean', 'std', 'max', 'min', 'count']).df
i_meas_df
- NXCALS feature query - variable
U_MAG from beam injection, through beam acceleration, to stable beams
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder
u_mag_df = QueryBuilder().with_nxcals(spark) \
.with_duration(t_start=1526898157236000000, t_end=1526903352338000000) \
.with_circuit_type('RB') \
.with_metadata(circuit_name='RB.A12', system='BUSBAR', signal='U_MAG', wildcard={'BUSBAR': 'DCBB.8L2.R'}) \
.feature_query(['mean', 'std', 'max', 'min', 'count']).df
u_mag_df
1.4. Circuit-Oriented Query - Polymorphism
- Multiple circuit names
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder
i_meas_dfs = QueryBuilder().with_nxcals(spark) \
.with_duration(t_start='2015-03-13 05:20:59.4910002', duration=[(100, 's'), (100, 's')]) \
.with_circuit_type('RB') \
.with_metadata(circuit_name=['RB.A12', 'RB.A45'], system='PC', signal='I_MEAS')\
.signal_query().dfs
ax = i_meas_dfs[0].plot()
i_meas_dfs[1].plot(ax=ax, grid=True)
- Multiple system names
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder
u_hts_dfs = QueryBuilder().with_nxcals(spark) \
.with_duration(t_start=1544622149598000000, duration=[(50, 's'), (150, 's')]) \
.with_circuit_type('RB') \
.with_metadata(circuit_name='RB.A12', system=['LEADS_EVEN', 'LEADS_ODD'], signal='U_HTS') \
.signal_query().dfs
ax = u_hts_dfs[0].plot()
u_hts_dfs[1].plot(ax=ax)
- Multiple signal names
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder
u_ext_dfs = QueryBuilder().with_pm() \
.with_timestamp(1544622149598000000) \
.with_circuit_type('RQ') \
.with_metadata(circuit_name='RQD.A12', system='QDS', signal=['U_1_EXT', 'U_2_EXT'],
source='16L2', wildcard={'CELL': '16L2'}) \
.signal_query().dfs
ax = u_ext_dfs[0].plot()
u_ext_dfs[1].plot(ax=ax)
- Signal wildcard
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder
import matplotlib.pyplot as plt
import pytimber
ldb = pytimber.LoggingDB()
u_diode_rqd_dfs = QueryBuilder().with_cals(ldb) \
.with_duration(t_start=int(1544622149613000000), duration=[(50, 's'), (150, 's')]) \
.with_circuit_type('RQ') \
.with_metadata(circuit_name='RQD.A12', system='DIODE_RQD', signal='U_DIODE_RQD', wildcard={'MAGNET': '*'})\
.signal_query().dfs
# plot all
fig, ax = plt.subplots()
for u_diode_rqd_df in u_diode_rqd_dfs:
    u_diode_rqd_df.plot(ax=ax)
ax.legend().set_visible(False)
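Conceptually, a wildcard such as `{'MAGNET': '*'}` expands one signal name pattern into a list of concrete signal names. The sketch below illustrates this with the standard `fnmatch` module; the name template `%MAGNET%:U_DIODE_RQD`, the magnet names and the `expand_wildcard` helper are all hypothetical and do not reflect the actual metadata stored in lhcsmapi.

```python
from fnmatch import fnmatchcase


def expand_wildcard(template, placeholder, pattern, known_values):
    """Substitute every known value matching the pattern into the name template."""
    matching = [value for value in known_values if fnmatchcase(value, pattern)]
    return [template.replace(f'%{placeholder}%', value) for value in matching]


# Hypothetical magnet names, for illustration only
known_magnets = ['MAGNET_A', 'MAGNET_B', 'OTHER_C']

all_signals = expand_wildcard('%MAGNET%:U_DIODE_RQD', 'MAGNET', '*', known_magnets)
some_signals = expand_wildcard('%MAGNET%:U_DIODE_RQD', 'MAGNET', 'MAGNET_*', known_magnets)
print(all_signals)
print(some_signals)
```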
1.5. Advanced Feature Query
NXCALS enables the calculation of signal features such as min, max, mean, std, and count directly on the cluster, without a costly query of the raw signal followed by a local calculation. This approach enables parallel computing on the cluster. To this end, a query should contain an element enabling a group-by operation; each group can then be computed in parallel. For the sake of compactness, we only show examples for circuit-oriented queries; however, the same principle applies to general-purpose queries.
- Feature query of multiple signals for the same period of time
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder
i_meas_df = QueryBuilder().with_nxcals(spark) \
.with_duration(t_start=1526898157236000000, t_end=1526903352338000000) \
.with_circuit_type('RB') \
.with_metadata(circuit_name='*', system='PC', signal='I_MEAS') \
.feature_query(['mean', 'std', 'max', 'min', 'count']).df
i_meas_df
- Feature query of multiple signals with the same period of time subdivided into three intervals - group by signal name and interval
t_start_injs = 1526898157236000000
t_end_injs = 1526899957236000000
t_start_sbs = 1526901552338000000
t_end_sbs = 1526903352338000000
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder
i_meas_df = QueryBuilder().with_nxcals(spark) \
.with_duration(t_start=t_start_injs, t_end=t_end_sbs) \
.with_circuit_type('RB') \
.with_metadata(circuit_name='RB.A12', system='PC', signal='I_MEAS')\
.signal_query().dfs[0]
ax = i_meas_df.plot(figsize=(10,5), linewidth=5)
ax.axvspan(xmin=t_start_injs, xmax=t_end_injs, facecolor='xkcd:goldenrod')
ax.axvspan(xmin=t_end_injs, xmax=t_start_sbs, facecolor='xkcd:grey')
ax.axvspan(xmin=t_start_sbs, xmax=t_end_sbs, facecolor='xkcd:green')
The function translate introduces a mapping based on the time column. Here, we consider three subintervals: beam injection, beam acceleration, and stable beams.
As a result, the mapped time column partitions the data, and the groups can be processed in parallel.
from pyspark.sql.functions import udf
from pyspark.sql.types import IntegerType
def translate(timestamp):
    # 1: beam injection, 2: beam acceleration, 3: stable beams, -1: outside all intervals
    if t_start_injs <= timestamp < t_end_injs:
        return 1
    if t_end_injs <= timestamp < t_start_sbs:
        return 2
    if t_start_sbs <= timestamp <= t_end_sbs:
        return 3
    return -1
translate_udf = udf(translate, IntegerType())
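To make the group-by idea concrete without a Spark cluster, the sketch below applies the same interval mapping to a handful of synthetic (timestamp, value) samples in plain Python and computes features per group. It only illustrates the principle; the sample values are made up and the actual computation in feature_query() runs on the cluster.

```python
from collections import defaultdict
from statistics import mean

# Interval boundaries in ns, as defined earlier in this section
t_start_injs = 1526898157236000000
t_end_injs = 1526899957236000000
t_start_sbs = 1526901552338000000
t_end_sbs = 1526903352338000000


def translate(timestamp):
    if t_start_injs <= timestamp < t_end_injs:
        return 1
    if t_end_injs <= timestamp < t_start_sbs:
        return 2
    if t_start_sbs <= timestamp <= t_end_sbs:
        return 3
    return -1


# Synthetic (timestamp, value) samples, for illustration only
samples = [
    (t_start_injs, 1.0), (t_start_injs + 1, 3.0),
    (t_end_injs, 10.0), (t_start_sbs - 1, 20.0),
    (t_start_sbs, 100.0), (t_end_sbs, 300.0),
]

# Group the values by interval id, then compute the features per group
groups = defaultdict(list)
for timestamp, value in samples:
    groups[translate(timestamp)].append(value)

features = {key: {'mean': mean(values), 'max': max(values),
                  'min': min(values), 'count': len(values)}
            for key, values in sorted(groups.items())}
print(features)
```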
The translate UDF is passed to feature_query() through the function argument:
i_meas_df = QueryBuilder().with_nxcals(spark) \
.with_duration(t_start=t_start_injs, t_end=t_end_sbs) \
.with_circuit_type('RB') \
.with_metadata(circuit_name='*', system='PC', signal='I_MEAS') \
.feature_query(['mean', 'std', 'max', 'min', 'count'], function=translate_udf).df
i_meas_df
The same method can be applied to NXCALS signals based on a variable:
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder
u_mag_ab_df = QueryBuilder().with_nxcals(spark) \
.with_duration(t_start=t_start_injs, t_end=t_end_sbs) \
.with_circuit_type('RB') \
.with_metadata(circuit_name='RB.A12', system='BUSBAR', signal='U_MAG', wildcard={'BUSBAR': 'DCBB.8L2.R'}) \
.feature_query(['mean', 'std', 'max', 'min', 'count'], function=translate_udf).df
u_mag_ab_df
This method can be combined with signal decimation, i.e., taking every n-th sample.
For example, this is useful for querying QPS boards A and B, which share the same channel with their samples shifted by 5, so that
- every 10th sample belongs to board A (or B): decimation=10
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder
u_mag_a_df = QueryBuilder().with_nxcals(spark) \
.with_duration(t_start=t_start_injs, t_end=t_end_sbs) \
.with_circuit_type('RB') \
.with_metadata(circuit_name='RB.A12', system='BUSBAR', signal='U_MAG', wildcard={'BUSBAR': 'DCBB.8L2.R'}) \
.feature_query(['mean', 'std', 'max', 'min', 'count'], function=translate_udf, decimation=10).df
u_mag_a_df
- every 10th sample, starting from the 5th, belongs to board B (or A): decimation=10, shift=5
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder
u_mag_b_df = QueryBuilder().with_nxcals(spark) \
.with_duration(t_start=t_start_injs, t_end=t_end_sbs) \
.with_circuit_type('RB') \
.with_metadata(circuit_name='RB.A12', system='BUSBAR', signal='U_MAG', wildcard={'BUSBAR': 'DCBB.8L2.R'}) \
.feature_query(['mean', 'std', 'max', 'min', 'count'], function=translate_udf, decimation=10, shift=5).df
u_mag_b_df
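The effect of decimation and shift can be sketched with plain list slicing. The `decimate` helper and the synthetic channel below are assumptions made for illustration; they mimic the described semantics but are not the code executed by feature_query().

```python
def decimate(samples, decimation, shift=0):
    """Keep every decimation-th sample, starting at index shift."""
    return samples[shift::decimation]


# Synthetic shared channel: board A writes at indices 0, 10, 20,
# board B writes at indices 5, 15, 25 (all other entries are fillers)
channel = [None] * 30
for i in range(0, 30, 10):
    channel[i] = f'A{i}'
for i in range(5, 30, 10):
    channel[i] = f'B{i}'

board_a = decimate(channel, decimation=10)
board_b = decimate(channel, decimation=10, shift=5)
print(board_a)
print(board_b)
```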
- with polymorphism, one can query all 1248 busbars at once (in two batches of 624 due to the limit of 1000 signals per query)
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder
u_mag_ab_df = QueryBuilder().with_nxcals(spark) \
.with_duration(t_start=t_start_injs, t_end=t_end_sbs) \
.with_circuit_type('RB') \
.with_metadata(circuit_name='*', system='BUSBAR', signal='U_MAG', wildcard={'BUSBAR': '*'}) \
.feature_query(['mean', 'std', 'max', 'min', 'count'], function=translate_udf).df
u_mag_ab_df.head()
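The batching arithmetic mentioned above can be sketched as follows: the number of batches is the smallest one that respects the limit, and the signals are then spread evenly across the batches. The `split_into_batches` helper and the placeholder signal names are hypothetical; how lhcsmapi splits queries internally is not shown in this notebook.

```python
import math


def split_into_batches(items, max_batch_size=1000):
    """Split items into the fewest equally sized batches not exceeding the limit."""
    if not items:
        return []
    n_batches = math.ceil(len(items) / max_batch_size)
    batch_size = math.ceil(len(items) / n_batches)
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


# Placeholder names standing in for the 1248 busbar signals
signal_names = [f'BUSBAR_{i}:U_MAG' for i in range(1248)]
batches = split_into_batches(signal_names)
batch_sizes = [len(batch) for batch in batches]
print(batch_sizes)
```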
1.6. Processing Raw Signals
Once the signals are queried, one can apply processing operations to each of them.
In this case, the order of operations does not matter (but can be checked).
| Signal query | Signal processing |
|---|---|
| {DB}.{DURATION}.{QUERY_PARAMETERS}.{QUERY} | |
| {DB}.{DURATION}.{CIRCUIT_TYPE}.{METADATA}.{QUERY} | |
| | .synchronize_time() |
| | .convert_index_to_sec() |
| | .create_col_from_index() |
| | .filter_median() |
| | .remove_values_for_time_less_than() |
| | .remove_initial_offset() |
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder
i_meas_df = QueryBuilder().with_nxcals(spark) \
.with_duration(t_start='2015-01-13 16:59:11+01:00', t_end='2015-01-13 17:15:46+01:00') \
.with_circuit_type('RB') \
.with_metadata(circuit_name='RB.A12', system='PC', signal='I_MEAS') \
.signal_query() \
.synchronize_time() \
.convert_index_to_sec().dfs[0]
i_meas_df.plot()
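The two processing steps used above can be pictured as a shift of the time index followed by a unit conversion. The sketch below assumes that synchronize_time() subtracts a reference timestamp (the first one by default) and that convert_index_to_sec() converts nanoseconds to seconds; both functions are simplified stand-ins operating on a plain dictionary with synthetic data, not the lhcsmapi implementation.

```python
def synchronize_time(signal, t_ref=None):
    """Shift the time index so that t_ref (default: first timestamp) becomes zero."""
    if t_ref is None:
        t_ref = min(signal)
    # Subtract as integers to avoid float rounding of large ns timestamps
    return {t - t_ref: value for t, value in signal.items()}


def convert_index_to_sec(signal):
    """Convert a time index given in nanoseconds to seconds."""
    return {t / 1e9: value for t, value in signal.items()}


# Synthetic signal: timestamp in ns -> value
signal_ns = {
    1421164751000000000: 0.0,
    1421164752000000000: 5.0,
    1421164753500000000: 10.0,
}

signal_sec = convert_index_to_sec(synchronize_time(signal_ns))
print(signal_sec)
```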
1.7. Processing Raw Events
For PM event queries, one can perform several operations on the resulting (source, timestamp) DataFrame.
| Event query | Event processing |
|---|---|
| {DB}.{DURATION}.{QUERY_PARAMETERS}.{QUERY} | |
| {DB}.{DURATION}.{CIRCUIT_TYPE}.{METADATA}.{QUERY} | |
| | .filter_source() |
| | .drop_duplicate_source() |
| | .sort_values() |
The processing methods are dedicated to operations that are frequently repeated on PM events. When searching for a given system and className with the wildcard '*' as a source, the event query can return events from different sectors. In this case, one can filter the events to keep only those belonging to a given sector. Some PM systems return duplicate events from different types of boards. In this case, one can drop duplicate sources. Finally, the events can be sorted by either source or timestamp.
- Filter source
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder
source_timestamp_df = QueryBuilder().with_pm() \
.with_duration(t_start='2015-03-13 05:20:59.4910002', duration=[(24*60*60, 's')]) \
.with_circuit_type('RB') \
.with_metadata(circuit_name='RB.A45', system='QH', source='*') \
.event_query() \
.df
source_timestamp_df
Executing filter_source() with circuit name and system type would filter out events not belonging to a given circuit name
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder
source_timestamp_df = QueryBuilder().with_pm() \
.with_duration(t_start='2015-03-13 05:20:59.4910002', duration=[(24*60*60, 's')]) \
.with_circuit_type('RB') \
.with_metadata(circuit_name='RB.A45', system='QH', source='*') \
.event_query() \
.filter_source('RB.A45', 'QH') \
.df
source_timestamp_df
- Drop duplicates
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder
source_timestamp_df = QueryBuilder().with_pm() \
.with_duration(t_start='2015-03-13 05:20:59.4910002', duration=[(24*60*60, 's')]) \
.with_circuit_type('RB') \
.with_metadata(circuit_name='RB.A45', system='QDS', source='*') \
.event_query() \
.df
source_timestamp_df
Executing filter_source() with circuit name and system type would filter out events not belonging to a given circuit name
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder
source_timestamp_df = QueryBuilder().with_pm() \
.with_duration(t_start='2015-03-13 05:20:59.4910002', duration=[(24*60*60, 's')]) \
.with_circuit_type('RB') \
.with_metadata(circuit_name='RB.A45', system='QDS', source='*') \
.event_query() \
.filter_source('RB.A45', 'QDS') \
.df
source_timestamp_df
- drop_duplicate_source()
Some PM systems return duplicate events from different types of boards. In this case one can drop duplicate sources.
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder
source_timestamp_df = QueryBuilder().with_pm() \
.with_duration(t_start='2015-03-13 05:20:59.4910002', duration=[(24*60*60, 's')]) \
.with_circuit_type('RB') \
.with_metadata(circuit_name='RB.A45', system='QDS', source='*') \
.event_query() \
.filter_source('RB.A45', 'QDS') \
.drop_duplicate_source() \
.df
source_timestamp_df
- sort_values()
The events can be sorted by either source or timestamp.
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder
source_timestamp_df = QueryBuilder().with_pm() \
.with_duration(t_start='2015-03-13 05:20:59.4910002', duration=[(24*60*60, 's')]) \
.with_circuit_type('RB') \
.with_metadata(circuit_name='RB.A45', system='QDS', source='*') \
.event_query() \
.filter_source('RB.A45', 'QDS') \
.drop_duplicate_source() \
.sort_values(by='timestamp') \
.df
source_timestamp_df
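The three event-processing steps can be sketched on a plain list of (source, timestamp) tuples. The functions below are simplified stand-ins with made-up sources: in particular, the real filter_source() takes a circuit name and a system, whereas this sketch assumes the set of allowed sources is already known.

```python
def filter_source(events, allowed_sources):
    """Keep only events whose source is in the allowed set."""
    return [event for event in events if event[0] in allowed_sources]


def drop_duplicate_source(events):
    """Keep the first event of every source."""
    seen = set()
    unique = []
    for source, timestamp in events:
        if source not in seen:
            seen.add(source)
            unique.append((source, timestamp))
    return unique


def sort_values(events, by='timestamp'):
    """Sort events by 'source' or 'timestamp'."""
    column = {'source': 0, 'timestamp': 1}[by]
    return sorted(events, key=lambda event: event[column])


# Synthetic events, for illustration only
events = [('SRC_B', 30), ('SRC_A', 20), ('SRC_X', 10), ('SRC_A', 25)]

filtered = filter_source(events, {'SRC_A', 'SRC_B'})
processed = sort_values(drop_duplicate_source(filtered), by='timestamp')
print(processed)
```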
2. AssertionBuilder()
| Signal input | Time range definition (optional) / Signal assertion | Signal assertions (if time range defined) |
|---|---|---|
| .with_signal() | | |
| | .has_min_max_value() | |
| | .compare_to_reference() | |
| | .with_time_range() | .has_min_max_variation() |
| | .with_time_range() | .has_min_max_slope() |
- has_min_max_value()
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder
from lhcsmapi.pyedsl.AssertionBuilder import AssertionBuilder
tt891a_dfs = QueryBuilder().with_nxcals(spark) \
.with_duration(t_start='2014-12-13 09:12:41+01:00', t_end='2014-12-13 12:27:11+01:00') \
.with_circuit_type('RB') \
.with_metadata(circuit_name='RB.A12', system=['LEADS_EVEN', 'LEADS_ODD'], signal='TT891A') \
.signal_query() \
.synchronize_time() \
.convert_index_to_sec() \
.filter_median().dfs
AssertionBuilder().with_signal(tt891a_dfs) \
.has_min_max_value(value_min=46, value_max=54)
In the case below, the assertion should fail and raise a warning.
AssertionBuilder().with_signal(tt891a_dfs) \
.has_min_max_value(value_min=50, value_max=54)
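The check behind such an assertion can be sketched as a comparison of the signal extremes against the limits, with a warning on failure. The function below is a simplified stand-in on a plain list of synthetic values; the exact behavior of AssertionBuilder (for instance, the content of its warning) is an assumption.

```python
import warnings


def has_min_max_value(values, value_min, value_max):
    """Return True if all values lie within [value_min, value_max], warn otherwise."""
    ok = value_min <= min(values) and max(values) <= value_max
    if not ok:
        warnings.warn(f'Signal range [{min(values)}, {max(values)}] '
                      f'is outside [{value_min}, {value_max}]')
    return ok


# Synthetic temperature readings, for illustration only
temperatures = [47.5, 49.0, 52.3]

passed = has_min_max_value(temperatures, value_min=46, value_max=54)

# Record the warning raised by the tighter limits instead of printing it
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')
    failed = has_min_max_value(temperatures, value_min=50, value_max=54)
```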
- compare_to_reference()
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder
from lhcsmapi.pyedsl.AssertionBuilder import AssertionBuilder
from lhcsmapi.reference.Reference import Reference
from lhcsmapi.Time import Time
timestamp_ee_rqd = 1544622149701000000
timestamp_fgc_rqd = 1544622149620000000
signal_names = 'T_RES'
t_res_df = QueryBuilder().with_pm() \
.with_timestamp(timestamp_ee_rqd) \
.with_circuit_type('RQ') \
.with_metadata(circuit_name='RQD.A12', system='EE', signal=signal_names).signal_query() \
.remove_values_for_time_less_than(timestamp_ee_rqd) \
.synchronize_time(timestamp_fgc_rqd) \
.convert_index_to_sec().dfs[0]
timestamp_ee_ref_rqd = Reference.get_power_converter_reference_fpa('RQ', 'RQD.A12', 'eePm')
timestamp_ee_ref_rqd = Time.to_unix_timestamp(timestamp_ee_ref_rqd)
timestamp_fgc_ref_rqd = Reference.get_power_converter_reference_fpa('RQ', 'RQD.A12', 'fgcPm')
timestamp_fgc_ref_rqd = Time.to_unix_timestamp(timestamp_fgc_ref_rqd)
t_res_ref_df = QueryBuilder().with_pm() \
.with_timestamp(timestamp_ee_ref_rqd) \
.with_circuit_type('RQ') \
.with_metadata(circuit_name='RQD.A12', system='EE', signal=signal_names).signal_query() \
.remove_values_for_time_less_than(timestamp_ee_ref_rqd) \
.synchronize_time(timestamp_fgc_ref_rqd) \
.convert_index_to_sec().dfs[0]
AssertionBuilder().with_signal([t_res_df])\
.compare_to_reference(signal_ref_dfs=[t_res_ref_df], abs_margin=25, scaling=1)
- has_min_max_variation()
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder
from lhcsmapi.pyedsl.AssertionBuilder import AssertionBuilder
from lhcsmapi.analysis.CircuitAnalysis import get_current_plateau_start_end
t_start = '2014-12-13 09:12:41+01:00'
t_end = '2014-12-13 12:27:11+01:00'
cv891_dfs = QueryBuilder().with_nxcals(spark) \
.with_duration(t_start=t_start, t_end=t_end) \
.with_circuit_type('RB') \
.with_metadata(circuit_name='RB.A12', system=['LEADS_EVEN', 'LEADS_ODD'], signal='CV891') \
.signal_query() \
.synchronize_time() \
.convert_index_to_sec() \
.filter_median() \
.dfs
i_meas_raw_df = QueryBuilder().with_nxcals(spark) \
.with_duration(t_start=t_start, t_end=t_end) \
.with_circuit_type('RB') \
.with_metadata(circuit_name='RB.A12', system='PC', signal='I_MEAS') \
.signal_query() \
.dfs[0]
plateau_timing_df = get_current_plateau_start_end(i_meas_raw_df, i_meas_threshold=500)
AssertionBuilder().with_signal(cv891_dfs) \
.with_time_range(t_start=plateau_timing_df['plateau_start_sync'], t_end=plateau_timing_df['plateau_end_sync']) \
.has_min_max_variation(variation_min_max=8)
In the case below, the variation is too tight and the assertion fails
AssertionBuilder().with_signal(cv891_dfs) \
.with_time_range(t_start=plateau_timing_df['plateau_start_sync'], t_end=plateau_timing_df['plateau_end_sync']) \
.has_min_max_variation(variation_min_max=1)
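A plausible reading of this assertion is that the peak-to-peak variation of the signal inside the time range must not exceed the given limit. The sketch below implements that reading on synthetic (time, value) pairs; the definition of the variation is an assumption, not taken from the lhcsmapi source.

```python
def has_min_max_variation(signal, t_start, t_end, variation_min_max):
    """Check that max - min within [t_start, t_end] does not exceed the limit."""
    window = [value for t, value in signal if t_start <= t <= t_end]
    return (max(window) - min(window)) <= variation_min_max


# Synthetic (time in s, value) pairs: a plateau between t=1 and t=3
signal = [(0, 10.0), (1, 50.0), (2, 52.0), (3, 55.0), (4, 20.0)]

loose_ok = has_min_max_variation(signal, t_start=1, t_end=3, variation_min_max=8)
tight_ok = has_min_max_variation(signal, t_start=1, t_end=3, variation_min_max=1)
print(loose_ok, tight_ok)
```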
- has_min_max_slope()
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder
from lhcsmapi.pyedsl.AssertionBuilder import AssertionBuilder
from lhcsmapi.analysis.CircuitAnalysis import get_current_plateau_start_end
t_start = '2014-12-13 09:12:41+01:00'
t_end = '2014-12-13 12:27:11+01:00'
u_res_dfs = QueryBuilder().with_nxcals(spark) \
.with_duration(t_start=t_start, t_end=t_end) \
.with_circuit_type('RB') \
.with_metadata(circuit_name='RB.A12', system=['LEADS_EVEN', 'LEADS_ODD'], signal='U_RES') \
.signal_query() \
.synchronize_time() \
.convert_index_to_sec() \
.filter_median() \
.dfs
i_meas_raw_df = QueryBuilder().with_nxcals(spark) \
.with_duration(t_start=t_start, t_end=t_end) \
.with_circuit_type('RB') \
.with_metadata(circuit_name='RB.A12', system='PC', signal='I_MEAS') \
.signal_query() \
.dfs[0]
plateau_timing_df = get_current_plateau_start_end(i_meas_raw_df, i_meas_threshold=500)
AssertionBuilder().with_signal(u_res_dfs) \
.with_time_range(t_start=plateau_timing_df['plateau_start_sync'], t_end=plateau_timing_df['plateau_end_sync']) \
.has_min_max_slope(slope_min=-2, slope_max=2)
In the case below, the slope is too tight and the assertion fails
AssertionBuilder().with_signal(u_res_dfs) \
.with_time_range(t_start=plateau_timing_df['plateau_start_sync'], t_end=plateau_timing_df['plateau_end_sync']) \
.has_min_max_slope(slope_min=-2e-3, slope_max=2e-3)
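A slope assertion of this kind can be sketched as a least-squares line fit over the selected time range, followed by a bounds check. Whether AssertionBuilder computes the slope by linear regression is an assumption; `linear_slope` is a hypothetical helper and the signal is synthetic.

```python
def linear_slope(times, values):
    """Least-squares slope of values versus times."""
    n = len(times)
    t_mean = sum(times) / n
    v_mean = sum(values) / n
    numerator = sum((t - t_mean) * (v - v_mean) for t, v in zip(times, values))
    denominator = sum((t - t_mean) ** 2 for t in times)
    return numerator / denominator


def has_min_max_slope(signal, t_start, t_end, slope_min, slope_max):
    """Check that the fitted slope within [t_start, t_end] lies within the bounds."""
    window = [(t, v) for t, v in signal if t_start <= t <= t_end]
    times, values = zip(*window)
    return slope_min <= linear_slope(times, values) <= slope_max


# Synthetic signal with a constant slope of 0.5 per second
signal = [(t, 0.5 * t + 1.0) for t in range(10)]

loose_ok = has_min_max_slope(signal, 2, 8, slope_min=-2, slope_max=2)
tight_ok = has_min_max_slope(signal, 2, 8, slope_min=-2e-3, slope_max=2e-3)
print(loose_ok, tight_ok)
```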
3. FeatureBuilder()
e.g.
FeatureBuilder().with_signal(u_hds_dfs) \
.calculate_features(features=['first', 'last20mean', 'tau_charge'], index=1544622149599000000)
Supported functions are:
['first', 'first20mean', 'last', 'last20mean', 'max', 'min', 'median', 'std', 'mean', 'tau_charge', 'tau_energy', 'tau_lin_reg', 'tau_exp_fit']
For example, the following cells calculate the initial voltage, the final mean voltage based on the last 20 points, and the characteristic time of the pseudo-exponential decay of a quench heater voltage.
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder
from lhcsmapi.pyedsl.FeatureBuilder import FeatureBuilder
import matplotlib.pyplot as plt
timestamp = 1544622149599000000
u_hds_dfs = QueryBuilder().with_pm() \
.with_timestamp(timestamp) \
.with_circuit_type('RQ') \
.with_metadata(circuit_name='RQD.A12', system='QH', signal='U_HDS', source='16L2', wildcard={'CELL': '16L2'}) \
.signal_query()\
.synchronize_time(timestamp)\
.convert_index_to_sec().dfs
ax = u_hds_dfs[0].plot(figsize=(15,7))
u_hds_dfs[1].plot(ax=ax, grid=True)
plt.show()
FeatureBuilder().with_signal(u_hds_dfs) \
.calculate_features(features=['first', 'last20mean', 'tau_charge'], index=1544622149599000000)
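To illustrate what such features could look like, the sketch below computes the first value, the mean of the last 20 points, and a decay time constant on a synthetic exponential signal. The time constant is obtained from a line fit to the logarithm of the signal; `tau_log_lin_fit` is a hypothetical name, and whether the 'tau_*' features of FeatureBuilder are computed this way is not stated in this notebook.

```python
import math


def first(values):
    return values[0]


def last20mean(values):
    tail = values[-20:]
    return sum(tail) / len(tail)


def tau_log_lin_fit(times, values):
    """Time constant of v(t) = v0 * exp(-t / tau) from a line fit to log(v)."""
    logs = [math.log(v) for v in values]
    n = len(times)
    t_mean = sum(times) / n
    l_mean = sum(logs) / n
    slope = (sum((t - t_mean) * (l - l_mean) for t, l in zip(times, logs))
             / sum((t - t_mean) ** 2 for t in times))
    # log(v) = log(v0) - t / tau, hence tau = -1 / slope
    return -1.0 / slope


# Synthetic decay with v0 = 900 and tau = 0.08 s, for illustration only
times = [0.01 * i for i in range(100)]
values = [900.0 * math.exp(-t / 0.08) for t in times]

features = {
    'first': first(values),
    'last20mean': last20mean(values),
    'tau': tau_log_lin_fit(times, values),
}
print(features)
```

A log-linear fit only works for strictly positive samples of a pure exponential; noisy or offset signals would need a more robust fitting method.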