taskprocessor: Enable subsystems and overload by subsystem

To prevent one subsystem's taskprocessors from causing others
to stall, new capabilities have been added to taskprocessors.

* Any taskprocessor name that has a '/' will have the part
  before the '/' saved as its "subsystem".
  Examples:
  "sorcery/acl-0000006a" and "sorcery/aor-00000019"
  will be grouped to subsystem "sorcery".
  "pjsip/distributor-00000025" and "pjsip/distributor-00000026"
  will be grouped to subsystem "pjsip".
  Taskprocessors with no '/' have an empty subsystem.

* When a taskprocessor enters high-water alert status and it
  has a non-empty subsystem, the subsystem alert count will
  be incremented.

* When a taskprocessor leaves high-water alert status and it
  has a non-empty subsystem, the subsystem alert count will be
  decremented.

* A new API, ast_taskprocessor_get_subsystem_alert(), has been
  added that returns the number of taskprocessors in alert for
  the subsystem.

* A new CLI command "core show taskprocessor alerted subsystems"
  has been added.

* A new unit test was added.
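
The grouping and counting rules above can be sketched as follows. This is
only an illustrative Python model, not the C implementation in Asterisk:
subsystem_of(), set_alert(), get_subsystem_alert() and the module-level
counter are hypothetical names, and the sketch assumes the name is split
at the first '/'.

```python
from collections import Counter

# Hypothetical model of the per-subsystem alert counts (assumption: the
# real bookkeeping lives in Asterisk's C taskprocessor code).
subsystem_alerts = Counter()


def subsystem_of(name):
    """Return the part of a taskprocessor name before the first '/', or ''."""
    prefix, sep, _ = name.partition('/')
    return prefix if sep else ''


def set_alert(name, alerted):
    """Adjust the subsystem count when a taskprocessor enters or leaves alert."""
    subsystem = subsystem_of(name)
    if not subsystem:
        return  # Taskprocessors with an empty subsystem are not counted.
    subsystem_alerts[subsystem] += 1 if alerted else -1


def get_subsystem_alert(subsystem):
    """Number of taskprocessors currently in alert for the subsystem."""
    return subsystem_alerts[subsystem]
```

In this model, putting "pjsip/distributor-00000025" and
"pjsip/distributor-00000026" into alert raises the "pjsip" count to 2
and leaves the "sorcery" count at 0.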

REMINDER: The taskprocessor code itself doesn't take any action
based on high-water alerts or overloading.  It's up to taskprocessor
users to check and take action themselves.  Currently only the pjsip
distributor does this.

* A new pjsip/global option "taskprocessor_overload_trigger"
  has been added that allows the user to select the trigger
  mechanism the distributor uses to pause accepting new requests.
  "none": Don't pause on any overload condition.
  "global": Pause on ANY taskprocessor overload (the default and
  current behavior)
  "pjsip_only": Pause only on pjsip taskprocessor overloads.

* The core pjsip pool was renamed from "SIP" to "pjsip" so it can
  be properly grouped into the "pjsip" subsystem.

* Stasis taskprocessor names were changed to use "stasis" as the
  subsystem.

* Sorcery core taskprocessor names were changed to use "sorcery" as
  the subsystem, matching the object taskprocessors.
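
The three "taskprocessor_overload_trigger" values can be read as a small
decision rule. The sketch below is an illustrative Python model under
stated assumptions, not the distributor's actual C code: should_pause()
and its two count parameters are hypothetical names standing in for the
number of taskprocessors in alert overall and in the "pjsip" subsystem.

```python
def should_pause(trigger, global_alert_count, pjsip_alert_count):
    """Decide whether the distributor pauses accepting new requests.

    trigger is one of the documented option values; the two counts are
    hypothetical inputs for this sketch.
    """
    if trigger == 'none':
        return False  # Never pause on an overload condition.
    if trigger == 'global':
        return global_alert_count > 0  # Any taskprocessor in alert.
    if trigger == 'pjsip_only':
        return pjsip_alert_count > 0  # Only pjsip taskprocessors count.
    raise ValueError('unknown taskprocessor_overload_trigger: %r' % (trigger,))
```

With "pjsip_only", an overload confined to another subsystem such as
"stasis" would no longer pause the distributor, whereas "global" would
still pause on it.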

Change-Id: I8c19068bb2fc26610a9f0b8624bdf577a04fcd56
George Joseph
2019-02-15 11:53:50 -07:00
parent da93d17af8
commit bae3fd04c1
13 changed files with 525 additions and 12 deletions


@@ -0,0 +1,42 @@
"""taskprocessor_overload_trigger
Revision ID: f3c0b8695b66
Revises: 0838f8db6a61
Create Date: 2019-02-15 15:03:50.106790
"""
# revision identifiers, used by Alembic.
revision = 'f3c0b8695b66'
down_revision = '0838f8db6a61'
from alembic import op
import sqlalchemy as sa
from sqlalchemy.dialects.postgresql import ENUM
PJSIP_TASKPROCESSOR_OVERLOAD_TRIGGER_NAME = 'pjsip_taskprocessor_overload_trigger_values'
PJSIP_TASKPROCESSOR_OVERLOAD_TRIGGER_VALUES = ['none', 'global', 'pjsip_only']


def upgrade():
    context = op.get_context()

    # PostgreSQL needs the enum type created before a column can use it.
    if context.bind.dialect.name == 'postgresql':
        enum = ENUM(*PJSIP_TASKPROCESSOR_OVERLOAD_TRIGGER_VALUES,
                    name=PJSIP_TASKPROCESSOR_OVERLOAD_TRIGGER_NAME)
        enum.create(op.get_bind(), checkfirst=False)

    op.add_column('ps_globals',
                  sa.Column('taskprocessor_overload_trigger',
                            sa.Enum(*PJSIP_TASKPROCESSOR_OVERLOAD_TRIGGER_VALUES,
                                    name=PJSIP_TASKPROCESSOR_OVERLOAD_TRIGGER_NAME,
                                    create_type=False)))
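

def downgrade():
    # Fetch the migration context once; it is used for both dialect checks.
    context = op.get_context()

    # MSSQL requires the enum's check constraint to be dropped first.
    if context.bind.dialect.name == 'mssql':
        op.drop_constraint('ck_ps_globals_taskprocessor_overload_trigger_pjsip_taskprocessor_overload_trigger_values', 'ps_globals')
    op.drop_column('ps_globals', 'taskprocessor_overload_trigger')

    # PostgreSQL keeps the enum type after the column is gone; drop it too.
    if context.bind.dialect.name == 'postgresql':
        enum = ENUM(*PJSIP_TASKPROCESSOR_OVERLOAD_TRIGGER_VALUES,
                    name=PJSIP_TASKPROCESSOR_OVERLOAD_TRIGGER_NAME)
        enum.drop(op.get_bind(), checkfirst=False)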